POLYKETIDE SYNTHASE ENZYMES AND RECOMBINANT DNA
CONSTRUCTS THEREFOR 



Cross-Reference to Related Applications 
The present application claims priority to related U.S. patent application Serial 
Nos. 60/102,748, filed 2 Oct. 1998; 60/139,650, filed 17 June 1999; and 60/123,810, filed 
11 Mar. 1999, each of which is incorporated herein by reference.

Field of the Invention 
The present invention relates to polyketides and the polyketide synthase (PKS) 
enzymes that produce them. The invention also relates generally to genes encoding PKS 
enzymes and to recombinant host cells containing such genes and in which expression of 
such genes leads to the production of polyketides. The present invention also relates to 
compounds useful as medicaments having immunosuppressive and/or neurotrophic 
activity. Thus, the invention relates to the fields of chemistry, molecular biology, and 
agricultural, medical, and veterinary technology. 

Background of the Invention 
Polyketides are a class of compounds synthesized from 2-carbon units through a 
series of condensations and subsequent modifications. Polyketides occur in many types of 
organisms, including fungi and mycelial bacteria, in particular, the actinomycetes. 
Polyketides are biologically active molecules with a wide variety of structures, and the 
class encompasses numerous compounds with diverse activities. Tetracycline, 
erythromycin, epothilone, FK-506, FK-520, narbomycin, picromycin, rapamycin, 
spinosyn, and tylosin are examples of polyketides. Given the difficulty in producing
polyketide compounds by traditional chemical methodology, and the typically low 
production of polyketides in wild-type cells, there has been considerable interest in 
finding improved or alternate means to produce polyketide compounds.

This interest has resulted in the cloning, analysis, and manipulation by
recombinant DNA technology of genes that encode PKS enzymes. The resulting
technology allows one to manipulate a known PKS gene cluster either to produce the 
polyketide synthesized by that PKS at higher levels than occur in nature or in hosts that 
otherwise do not produce the polyketide. The technology also allows one to produce 
molecules that are structurally related to, but distinct from, the polyketides produced from 
known PKS gene clusters. See, e.g., PCT publication Nos. WO 93/13663; 95/08548; 
96/40968; 97/02358; 98/27203; and 98/49315; United States Patent Nos. 4,874,748;
5,063,155; 5,098,837; 5,149,639; 5,672,491; 5,712,146; 5,830,750; and 5,843,718; and
Fu et al., 1994, Biochemistry 33: 9321-9326; McDaniel et al., 1993, Science 262:
1546-1550; and Rohr, 1995, Angew. Chem. Int. Ed. Engl. 34(8): 881-888, each of which is
incorporated herein by reference. 

Polyketides are synthesized in nature by PKS enzymes. These enzymes, which are
complexes of multiple large proteins, are similar to the synthases that catalyze 
condensation of 2-carbon units in the biosynthesis of fatty acids. PKSs catalyze the 
biosynthesis of polyketides through repeated, decarboxylative Claisen condensations 
between acylthioester building blocks. The building blocks used to form complex 
polyketides are typically acylthioesters, such as acetyl, butyryl, propionyl, malonyl, 
hydroxymalonyl, methylmalonyl, and ethylmalonyl CoA. Other building blocks include 
amino acid-like acylthioesters. PKS enzymes that incorporate such building blocks
include an activity that functions as an amino acid ligase (an AMP ligase) or as a non- 
ribosomal peptide synthetase (NRPS). Two major types of PKS enzymes are known; 
these differ in their composition and mode of synthesis of the polyketide synthesized. 
These two major types of PKS enzymes are commonly referred to as Type I or "modular"
and Type II or "iterative" PKS enzymes.

In the Type I or modular PKS enzyme group, a set of separate catalytic active 
sites (each active site is termed a "domain", and a set thereof is termed a "module") exists 
for each cycle of carbon chain elongation and modification in the polyketide synthesis 
pathway. The typical modular PKS is composed of several large polypeptides, which can 
be segregated from amino to carboxy termini into a loading module, multiple extender 
modules, and a releasing (or thioesterase) domain. The PKS enzyme known as
6-deoxyerythronolide B synthase (DEBS) is a Type I PKS. In DEBS, there is a loading
module, six extender modules, and a thioesterase (TE) domain. The loading module, six 
extender modules, and TE of DEBS are present on three separate proteins (designated 
DEBS-1, DEBS-2, and DEBS-3, with two extender modules per protein). Each of the 
DEBS polypeptides is encoded by a separate open reading frame (ORF) or gene; these 
genes are known as eryAI, eryAII, and eryAIII. See Caffrey et al., 1992, FEBS Letters
304: 205, and U.S. Patent No. 5,824,513, each of which is incorporated herein by 
reference. 

Generally, the loading module is responsible for binding the first building block 
used to synthesize the polyketide and transferring it to the first extender module. The 
loading module of DEBS consists of an acyltransferase (AT) domain and an acyl carrier 
protein (ACP) domain. Another type of loading module utilizes an inactivated 
ketosynthase (KS) domain and AT and ACP domains. This inactivated KS is in some 
instances called KS^Q, where the superscript letter is the abbreviation for the amino acid,
glutamine, that is present instead of the active site cysteine required for ketosynthase
activity. In other PKS enzymes, including the FK-506 PKS, the loading module
incorporates an unusual starter unit and is composed of a CoA ligase-like activity domain.
In any event, the loading module recognizes a particular acyl-CoA (usually acetyl or 
propionyl but sometimes butyryl or other acyl-CoA) and transfers it as a thiol ester to the 
ACP of the loading module. 

The AT on each of the extender modules recognizes a particular extender-CoA
(malonyl or alpha-substituted malonyl, i.e., methylmalonyl, ethylmalonyl, and 2- 
hydroxymalonyl) and transfers it to the ACP of that extender module to form a thioester. 
Each extender module is responsible for accepting a compound from a prior module, 
binding a building block, attaching the building block to the compound from the prior 
module, optionally performing one or more additional functions, and transferring the 
resulting compound to the next module. 

Each extender module of a modular PKS contains a KS, AT, ACP, and zero, one, 
two, or three domains that modify the beta-carbon of the growing polyketide chain. A 
typical (non-loading) minimal Type I PKS extender module is exemplified by extender
module three of DEBS, which contains a KS domain, an AT domain, and an ACP 
domain. These three domains are sufficient to activate a 2-carbon extender unit and attach 
it to the growing polyketide molecule. The next extender module, in turn, is responsible 
for attaching the next building block and transferring the growing compound to the next 
extender module until synthesis is complete. 

Once the PKS is primed with acyl- and malonyl-ACPs, the acyl group of the 
loading module is transferred to form a thiol ester (trans-esterification) at the KS of the 
first extender module; at this stage, extender module one possesses an acyl-KS and a
malonyl (or substituted malonyl) ACP. The acyl group derived from the loading module 
is then covalently attached to the alpha-carbon of the malonyl group to form a carbon- 
carbon bond, driven by concomitant decarboxylation, and generating a new acyl-ACP 
that has a backbone two carbons longer than the loading building block (elongation or 
extension). 
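
The carbon bookkeeping of the elongation step described above can be sketched in a few lines of Python. This is only an illustration: the function name `backbone_length` and the example values are assumptions introduced here, and the sketch encodes nothing beyond the statement that each condensation lengthens the backbone by two carbons.

```python
def backbone_length(starter_carbons: int, extender_modules: int) -> int:
    """Backbone carbons after all extender modules have acted.

    Each decarboxylative condensation adds two backbone carbons,
    whichever (substituted) malonyl extender is used; carbons in
    alpha-substituent side chains are not counted.
    """
    if starter_carbons < 1 or extender_modules < 0:
        raise ValueError("need at least one starter carbon and a non-negative module count")
    return starter_carbons + 2 * extender_modules


# Illustrative values only: a 3-carbon (propionyl) starter unit
# followed by six extender modules.
print(backbone_length(3, 6))  # 15
```

The count depends only on the starter unit and the number of extender modules, because the extra carboxyl carbon of each malonyl-type extender is lost as carbon dioxide during the condensation.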

The polyketide chain, growing by two carbons at each extender module, is
sequentially passed as covalently bound thiol esters from extender module to extender 
module, in an assembly line-like process. The carbon chain produced by this process 
alone would possess a ketone at every other carbon atom, producing a polyketone, from 
which the name polyketide arises. Most commonly, however, additional enzymatic 
activities modify the beta-keto group of each two-carbon unit just after it has been added
to the growing polyketide chain but before it is transferred to the next module. 

Thus, in addition to the minimal module containing KS, AT, and ACP domains 
necessary to form the carbon-carbon bond, and as noted above, other domains that 
modify the beta-carbonyl moiety can be present. For example, modules may contain a
ketoreductase (KR) domain that reduces the keto group to an alcohol. Modules may also
contain a KR domain plus a dehydratase (DH) domain that dehydrates the alcohol to a
double bond. Modules may also contain a KR domain, a DH domain, and an
enoylreductase (ER) domain that converts the double bond product to a saturated single
bond, leaving the beta-carbon as a methylene function. An extender module can also contain
other enzymatic activities, such as, for example, a methylase or dimethylase activity. 
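
The relationship between the reductive domains in a module and the state of the beta-carbon can be written as a small lookup. The sketch below is a simplification under stated assumptions: the names `REDUCTIVE`, `BETA_CARBON_STATE`, and `beta_carbon_state` are introduced here for illustration, every listed domain is assumed to be catalytically active, and only the four combinations described above are recognized.

```python
from typing import Iterable

# Beta-carbon processing domains named in the text; KS, AT and ACP
# are present in every extender module and are ignored here.
REDUCTIVE = {"KR", "DH", "ER"}

# Assumed mapping from the set of active reductive domains to the
# functional group left at the beta-carbon.
BETA_CARBON_STATE = {
    frozenset(): "ketone",
    frozenset({"KR"}): "hydroxyl",
    frozenset({"KR", "DH"}): "double bond",
    frozenset({"KR", "DH", "ER"}): "methylene",
}


def beta_carbon_state(domains: Iterable[str]) -> str:
    """Return the beta-carbon functional group produced by one module."""
    active = frozenset(domains) & REDUCTIVE
    try:
        return BETA_CARBON_STATE[active]
    except KeyError:
        # e.g. DH without KR: not one of the combinations described above
        raise ValueError(f"combination not covered: {sorted(active)}") from None


print(beta_carbon_state(["KS", "AT", "ACP"]))        # ketone
print(beta_carbon_state(["KS", "AT", "KR", "ACP"]))  # hydroxyl
```

A module whose reductive domain is present in sequence but inactive would need to be entered without that domain for this lookup to give the expected answer.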

After traversing the final extender module, the polyketide encounters a releasing
domain that cleaves the polyketide from the PKS and typically cyclizes the polyketide.
For example, final synthesis of 6-deoxyerythronolide B (6-dEB) is regulated by a TE
domain located at the end of extender module six. In the synthesis of 6-dEB, the TE
domain catalyzes cyclization of the macrolide ring by formation of an ester linkage. In
FK-506, FK-520, rapamycin, and similar polyketides, the TE activity is replaced by a
RapP (for rapamycin) or RapP-like activity that makes a linkage incorporating a
pipecolic acid residue. The enzymatic activity that catalyzes this incorporation for the
rapamycin enzyme is known as RapP, encoded by the rapP gene. The polyketide can be
modified further by tailoring enzymes; these enzymes add carbohydrate groups or methyl
groups, or make other modifications, e.g., oxidation or reduction, on the polyketide core
molecule. For example, 6-dEB is hydroxylated at C-6 and C-12 and glycosylated at C-3
and C-5 in the synthesis of erythromycin A.

In Type I PKS polypeptides, the order of catalytic domains is conserved. When all 
beta-keto processing domains are present in a module, the order of domains in that 
module from N-to-C-terminus is always KS, AT, DH, ER, KR, and ACP. Some or all of 
the beta-keto processing domains may be missing in particular modules, but the order of 
the domains present in a module remains the same. The order of domains within modules 
is believed to be important for proper folding of the PKS polypeptides into an active
complex. Importantly, there is considerable flexibility in PKS enzymes, which allows for 
the genetic engineering of novel catalytic complexes. The engineering of these enzymes 
is achieved by modifying, adding, or deleting domains, or replacing them with those 
taken from other Type I PKS enzymes. It is also achieved by deleting, replacing, or 
adding entire modules with those taken from other sources. A genetically engineered 
PKS complex should of course have the ability to catalyze the synthesis of the product 
predicted from the genetic alterations made. 
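
The conserved domain order stated above lends itself to a simple consistency check on a proposed module design. The following Python sketch is illustrative only; `CANONICAL_ORDER` and `follows_canonical_order` are names assumed here, and the check covers ordering alone, not whether an engineered module would fold or be active.

```python
from typing import Sequence

# N-to-C-terminal order of domains in a fully reducing module,
# as stated in the text.
CANONICAL_ORDER = ["KS", "AT", "DH", "ER", "KR", "ACP"]


def follows_canonical_order(module_domains: Sequence[str]) -> bool:
    """True if the listed domains appear in canonical relative order.

    Beta-keto processing domains may be absent; the domains that are
    present must still occur in the same relative order. An unknown
    domain name raises ValueError (from list.index).
    """
    positions = [CANONICAL_ORDER.index(d) for d in module_domains]
    # Strictly increasing positions: correct order and no duplicates.
    return all(a < b for a, b in zip(positions, positions[1:]))


print(follows_canonical_order(["KS", "AT", "KR", "ACP"]))        # True
print(follows_canonical_order(["KS", "AT", "KR", "DH", "ACP"]))  # False
```

Such a check could be run on each module of a hybrid design before any construct is built, as a first filter for arrangements that depart from the order observed in natural Type I PKS enzymes.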

Alignments of the many available amino acid sequences for Type I PKS enzymes 
have approximately defined the boundaries of the various catalytic domains. Sequence
alignments also have revealed linker regions between the catalytic domains and at the N- 
and C-termini of individual polypeptides. The sequences of these linker regions are less 
well conserved than are those for the catalytic domains, which is in part how linker
regions are identified. Linker regions can be important for proper association between 
domains and between the individual polypeptides that comprise the PKS complex. One 
can thus view the linkers and domains together as creating a scaffold on which the 
domains and modules are positioned in the correct orientation to be active. This 
organization and positioning, if retained, permits PKS domains of different or identical 
substrate specificities to be substituted (usually at the DNA level) between PKS enzymes 
by various available methodologies. In selecting the boundaries of, for example, an AT 
replacement, one can thus make the replacement so as to retain the linkers of the recipient 
PKS or to replace them with the linkers of the donor PKS AT domain, or, preferably, 
make both constructs to ensure that the correct linker regions between the KS and AT 
domains have been included in at least one of the engineered enzymes. Thus, there is 
considerable flexibility in the design of new PKS enzymes with the result that known 
polyketides can be produced more effectively, and novel polyketides useful as 
pharmaceuticals or for other purposes can be made. 

By appropriate application of recombinant DNA technology, a wide variety of
polyketides can be prepared in a variety of different host cells provided one has access to
nucleic acid compounds that encode PKS proteins and polyketide modification enzymes.
The present invention helps meet the need for such nucleic acid compounds by providing 
recombinant vectors that encode the FK-520 PKS enzyme and various FK-520 
modification enzymes. Moreover, while the FK-506 and FK-520 polyketides have many 
useful activities, there remains a need for compounds with similar useful activities but 
with a better pharmacokinetic profile and metabolism and fewer side-effects. The present
invention helps meet the need for such compounds as well. 

Summary of the Invention 
In one embodiment, the present invention provides recombinant DNA vectors that 
encode all or part of the FK-520 PKS enzyme. Illustrative vectors of the invention 
include cosmids pKOS034-120, pKOS034-124, pKOS065-C31, pKOS065-C3,
pKOS065-M27, and pKOS065-M21. The invention also provides nucleic acid compounds that
encode the various domains of the FK-520 PKS, i.e., the KS, AT, ACP, KR, DH, and ER
domains. These compounds can be readily used, alone or in combination with nucleic 
acids encoding other FK-520 or non-FK-520 PKS domains, as intermediates in the 
construction of recombinant vectors that encode all or part of PKS enzymes that make 
novel polyketides. 

The invention also provides isolated nucleic acids that encode all or part of one or 
more modules of the FK-520 PKS, each module comprising a ketosynthase activity, an 
acyl transferase activity, and an acyl carrier protein activity. The invention provides an 
isolated nucleic acid that encodes one or more open reading frames of FK-520 PKS 
genes, said open reading frames comprising coding sequences for a CoA ligase activity, 
an NRPS activity, or two or more extender modules. The invention also provides 
recombinant expression vectors containing these nucleic acids. 

In another embodiment, the invention provides isolated nucleic acids that encode 
all or a part of a PKS that contains at least one module in which at least one of the 
domains in the module is a domain from a non-FK-520 PKS and at least one domain is 
from the FK-520 PKS. The non-FK-520 PKS domain or module originates from the 
rapamycin PKS, the FK-506 PKS, DEBS, or another PKS. The invention also provides 
recombinant expression vectors containing these nucleic acids. 

In another embodiment, the invention provides a method of preparing a
polyketide, said method comprising transforming a host cell with a recombinant DNA
vector that encodes at least one module of a PKS, said module comprising at least one
FK-520 PKS domain, and culturing said host cell under conditions such that said PKS is
produced and catalyzes synthesis of said polyketide. In one aspect, the method is
practiced with a Streptomyces host cell. In another aspect, the polyketide produced is
FK-520. In another aspect, the polyketide produced is a polyketide related in structure to
FK-520. In another aspect, the polyketide produced is a polyketide related in structure to
FK-506 or rapamycin.

In another embodiment, the invention provides a set of genes in recombinant form 
sufficient for the synthesis of ethylmalonyl CoA in a heterologous host cell. These genes 
and the methods of the invention enable one to create recombinant host cells with the 
ability to produce polyketides or other compounds that require ethylmalonyl CoA for
biosynthesis. The invention also provides recombinant nucleic acids that encode AT 
domains specific for ethylmalonyl CoA. Thus, the compounds of the invention can be 
used to produce polyketides requiring ethylmalonyl CoA in host cells that otherwise are 
unable to produce such polyketides. 

In another embodiment, the invention provides a set of genes in recombinant form 
sufficient for the synthesis of 2-hydroxymalonyl CoA and 2-methoxymalonyl CoA in a 
heterologous host cell. These genes and the methods of the invention enable one to create 
recombinant host cells with the ability to produce polyketides or other compounds that 
require 2-hydroxymalonyl CoA for biosynthesis. The invention also provides 
recombinant nucleic acids that encode AT domains specific for 2-hydroxymalonyl CoA 
and 2-methoxymalonyl CoA. Thus, the compounds of the invention can be used to 
produce polyketides requiring 2-hydroxymalonyl CoA or 2-methoxymalonyl CoA in host 
cells that are otherwise unable to produce such polyketides. 

In another embodiment, the invention provides a compound related in structure to 
FK-520 or FK-506 that is useful in the treatment of a medical condition. These 
compounds include compounds in which the C-13 methoxy group is replaced by a moiety
selected from the group consisting of hydrogen, methyl, and ethyl moieties. Such 
compounds are less susceptible to the main in vivo pathway of degradation for FK-520 
and FK-506 and related compounds and thus exhibit an improved pharmacokinetic 
profile. The compounds of the invention also include compounds in which the C-15 
methoxy group is replaced by a moiety selected from the group consisting of hydrogen, 
methyl, and ethyl moieties. The compounds of the invention also include the above 
compounds further modified by chemical methodology to produce derivatives such as, 
but not limited to, the C-18 hydroxyl derivatives, which have potent neurotrophic but not
immunosuppressive activities.

Thus, the invention provides polyketides having the structure:




wherein R1 is hydrogen, methyl, ethyl, or allyl; R2 is hydrogen or hydroxyl, provided
that when R2 is hydrogen, there is a double bond between C-20 and C-19; R3 is hydrogen
or hydroxyl; R4 is methoxyl, hydrogen, methyl, or ethyl; and R5 is methoxyl, hydrogen,
methyl, or ethyl; but not including FK-506, FK-520, 18-hydroxy-FK-520, and 18- 
hydroxy-FK-506. The invention provides these compounds in purified form and in 
pharmaceutical compositions. 

In another embodiment, the invention provides a method for treating a medical 
condition by administering a pharmaceutically efficacious dose of a compound of the 
invention. The compounds of the invention may be administered to achieve 
immunosuppression or to stimulate nerve growth and regeneration. 

These and other embodiments and aspects of the invention will be more fully 
understood after consideration of the attached Drawings and their brief description below, 
together with the detailed description, examples, and claims that follow. 

Brief Description of the Drawings 
Figure 1 shows a diagram of the FK-520 biosynthetic gene cluster. The top line 
provides a scale in kilobase pairs (kb). The second line shows a restriction map with 
selected restriction enzyme recognition sequences indicated. K is KpnI; X is XhoI; S is
SacI; P is PstI; and E is EcoRI. The third line indicates the position of FK-520 PKS and
related genes. Genes are abbreviated with a one letter designation, i.e., C is fkbC.
Immediately under the third line are numbered segments showing where the loading
module (L) and ten different extender modules (numbered 1-10) are encoded on the
various genes shown. At the bottom of the Figure, the DNA inserts of various cosmids of 
the invention (i.e., 34-124 is cosmid pKOS034-124) are shown in alignment with the FK- 
520 biosynthetic gene cluster. 

Figure 2 shows the loading module (load), the ten extender modules, and the 
peptide synthetase domain of the FK-520 PKS, together with, on the top line, the genes 
that encode the various domains and modules. Also shown are the various intermediates 
in FK-520 biosynthesis, as well as the structure of FK-520, with carbons 13, 15, 21, and 
31 numbered. The various domains of each module and subdomains of the loading 
module are also shown. The darkened circles showing the DH domains in modules 2, 3, 
and 4 indicate that the dehydratase domain is not functional as a dehydratase; this domain 
may affect the stereochemistry at the corresponding position in the polyketide. The 
substituents on the FK-520 structure that result from the action of non-PKS enzymes are 
also indicated by arrows, together with the types of enzymes or the genes that code for 
the enzymes that mediate the action. Although the methyltransferase is shown acting at 
the C-13 and C-15 hydroxyl groups after release of the polyketide from the PKS, the 
methyltransferase may act on the 2-hydroxymalonyl substrate prior to or 
contemporaneously with its incorporation during polyketide synthesis. 

Figure 3 shows a close-up view of the left end of the FK-520 gene cluster, which 
contains at least ten additional genes. The ethyl side chain on carbon 21 of FK-520 
(Figure 2) is derived from an ethylmalonyl CoA extender unit that is incorporated by an 
ethylmalonyl specific AT domain in extender module 4 of the PKS. At least four of the 
genes in this region code for enzymes involved in ethylmalonyl biosynthesis. The 
polyhydroxybutyrate depolymerase is involved in maintaining hydroxybutyryl-CoA 
pools during FK-520 production. Polyhydroxybutyrate accumulates during vegetative 
growth and disappears during stationary phase in other Streptomyces (Ranade and
Vining, 1993, Can. J. Microbiol. 39:377). Open reading frames with unknown function
are indicated with a question mark.

Figure 4 shows a biosynthetic pathway for the biosynthesis of ethylmalonyl CoA
from acetoacetyl CoA consistent with the function assigned to four of the genes in the 
FK-520 gene cluster shown in Figure 3. 

Figure 5 shows a close-up view of the right-end of the FK-520 PKS gene cluster 
(and of the sequences on cosmid pKOS065-C31). The genes shown include fkbD, fkbM
(a methyl transferase that methylates the hydroxyl group on C-31 of FK-520), fkbN (a
homolog of a gene described as a regulator of cholesterol oxidase and that is believed to
be a transcriptional activator), fkbQ (a type II thioesterase, which can increase polyketide
production levels), and fkbS (a crotonyl-CoA reductase involved in the biosynthesis of
ethylmalonyl CoA). 

Figure 6 shows the proposed degradative pathway for tacrolimus (FK-506) 
metabolism. 

Figure 7 shows a schematic process for the construction of recombinant PKS 
genes of the invention that encode PKS enzymes that produce 13-desmethoxy FK-506 
and FK-520 polyketides of the invention, as described in Example 4, below. 

Figure 8, in Parts A and B, shows certain compounds of the invention preferred 
for dermal application in Part A and a synthetic route for making those compounds in 
Part B.

Detailed Description of the Invention 
Given the valuable pharmaceutical properties of polyketides, there is a need for 
methods and reagents for producing large quantities of polyketides, as well as for 
producing related compounds not found in nature. The present invention provides such 
methods and reagents, with particular application to methods and reagents for producing 
the polyketides known as FK-520, also known as ascomycin or L-683,590 (see Holt et
al., 1993, JACS 115:9925), and FK-506, also known as tacrolimus. Tacrolimus is a
macrolide immunosuppressant used to prevent or treat rejection of transplanted heart, 
kidney, liver, lung, pancreas, and small bowel allografts. The drug is also useful for the 
prevention and treatment of graft-versus-host disease in patients receiving bone marrow
transplants, and for the treatment of severe, refractory uveitis. There have been additional 
reports of the unapproved use of tacrolimus for other conditions, including alopecia
universalis, autoimmune chronic active hepatitis, inflammatory bowel disease, multiple 
sclerosis, primary biliary cirrhosis, and scleroderma. The invention provides methods and 
reagents for making novel polyketides related in structure to FK-520 and FK-506, and to
structurally related polyketides such as rapamycin.

The FK-506 and rapamycin polyketides are potent immunosuppressants, with 
chemical structures shown below. 




FK-506 Rapamycin 

FK-520 differs from FK-506 in that it lacks the allyl group at C-21 of FK-506, having 
instead an ethyl group at that position, and has similar activity to FK-506, albeit reduced 
immunosuppressive activity. 

These compounds act through initial formation of an intermediate complex with 
protein "immunophilins" known as FKBPs (FK-506 binding proteins), including FKBP- 
12. Immunophilins are a class of cytosolic proteins that form complexes with molecules 
such as FK-506, FK-520, and rapamycin that in turn serve as ligands for other cellular 
targets involved in signal transduction. Binding of FK-506, FK-520, and rapamycin to 
FKBP occurs through the structurally similar segments of the polyketide molecules, 
known as the "FKBP-binding domain" (as generally but not precisely indicated by the 
stippled regions in the structures above). The FK-506-FKBP complex then binds 
calcineurin, while the rapamycin-FKBP complex binds to a protein known as RAFT-1.
Binding of the FKBP-polyketide complex to these second proteins occurs through the
dissimilar regions of the drugs known as the "effector" domains. 



The three component FKBP-polyketide-effector complex is required for signal 
transduction and subsequent immunosuppressive activity of FK-506, FK-520, and 
rapamycin. Modifications in the effector domains of FK-506, FK-520, and rapamycin 
that destroy binding to the effector proteins (calcineurin or RAFT) lead to loss of 
immunosuppressive activity, even though FKBP binding is unaffected. Further, such 
analogs antagonize the immunosuppressive effects of the parent polyketides, because 
they compete for FKBP. Such non-immunosuppressive analogs also show reduced 
toxicity (see Dumont et al., 1992, Journal of Experimental Medicine 176: 751-760),
indicating that much of the toxicity of these drugs is not linked to FKBP binding. 

In addition to immunosuppressive activity, FK-520, FK-506, and rapamycin have 
neurotrophic activity. In the central nervous system and in peripheral nerves, 
immunophilins are referred to as "neuroimmunophilins". The neuroimmunophilin FKBP 
is markedly enriched in the central nervous system and in peripheral nerves. Molecules 
that bind to the neuroimmunophilin FKBP, such as FK-506 and FK-520, have the 
remarkable effect of stimulating nerve growth. In vitro, they act as neurotrophins, i.e., 
they promote neurite outgrowth in NGF-treated PC12 cells and in sensory neuronal
cultures, and in intact animals, they promote regrowth of damaged facial and sciatic
nerves, and repair lesioned serotonin and dopamine neurons in the brain. See Gold et al.,
Jun. 1999, J. Pharm. Exp. Ther. 289(3): 1202-1210; Lyons et al., 1994, Proc. National
Academy of Science 91: 3191-3195; Gold et al., 1995, Journal of Neuroscience 15:
7509-7516; and Steiner et al., 1997, Proc. National Academy of Science 94: 2019-2024.
Further, the restored central and peripheral neurons appear to be functional. 

Compared to protein neurotrophic molecules (BDNF, NGF, etc.), the small-
molecule neurotrophins such as FK-506, FK-520, and rapamycin have different, and 
often advantageous, properties. First, whereas protein neurotrophins are difficult to 
deliver to their intended site of action and may require intra-cranial injection, the small- 
molecule neurotrophins display excellent bioavailability; they are active when 
administered subcutaneously and orally. Second, whereas protein neurotrophins show 
quite specific effects, the small-molecule neurotrophins show rather broad effects. 
Finally, whereas protein neurotrophins often show effects on normal sensory nerves, the 
small-molecule neurotrophins do not induce aberrant sprouting of normal neuronal 
processes and seem to affect damaged nerves specifically. Neuroimmunophilin ligands 
have potential therapeutic utility in a variety of disorders involving nerve degeneration 
(e.g. multiple sclerosis, Parkinson's disease, Alzheimer's disease, stroke, traumatic spinal 
cord and brain injury, peripheral neuropathies). 

Recent studies have shown that the immunosuppressive and neurite outgrowth 
activity of FK-506, FK-520, and rapamycin can be separated; the neuroregenerative 
activity in the absence of immunosuppressive activity is retained by agents which bind to 
FKBP but not to the effector proteins calcineurin or RAFT. See Steiner et al., 1997,
Nature Medicine 3: 421-428. 

>> Nerve Regeneration 
Available structure-activity data show that the important features for neurotrophic
activity of rapamycin, FK-520, and FK-506 lie within the common, contiguous segments 
of the macrolide ring that bind to FKBP. This portion of the molecule is termed the
"FKBP binding domain" (see Van Duyne et al., 1993, Journal of Molecular Biology 229:
105-124). Nevertheless, the effector domains of the parent macrolides contribute to
conformational rigidity of the binding domain and thus indirectly contribute to FKBP 
binding. 




"FKBP binding domain" 

There are a number of other reported analogs of FK-506, FK-520, and rapamycin that 
bind to FKBP but not the effector protein calcineurin or RAFT. These analogs show 
effects on nerve regeneration without immunosuppressive effects. 

Naturally occurring FK-520 and FK-506 analogs include the antascomycins, 
which are FK-506-like macrolides that lack the functional groups of FK-506 that bind to 
calcineurin (see Fehr et al., 1996, The Journal of Antibiotics 49: 230-233). These
molecules bind FKBP as effectively as does FK-506; they antagonize the effects of both 
FK-506 and rapamycin, yet lack immunosuppressive activity. 
Antascomycin A

Other analogs can be produced by chemically modifying FK-506, FK-520, or 
rapamycin. One approach to obtaining neuroimmunophilin ligands is to destroy the 
effector binding region of FK-506, FK-520, or rapamycin by chemical modification. 
While the chemical modifications permitted on the parent compounds are quite limited, 
some useful chemically modified analogs exist. The FK-520 analog L-685,818 (ED50 =
0.7 nM for FKBP binding; see Dumont et al., 1992), and the rapamycin analog
WAY-124,466 (IC50 = 12.5 nM; see Ocain et al., 1993, Biochemical and Biophysical
Research Communications 192: 1340-1346) are about as effective as FK-506, FK-520, and
rapamycin at promoting neurite outgrowth in sensory neurons (see Steiner et al., 1997).




One of the few positions of rapamycin that is readily amenable to chemical 
modification is the allylic 16-methoxy group; this reactive group is readily exchanged by 
acid-catalyzed nucleophilic substitution. Replacement of the 16-methoxy group of 
rapamycin with a variety of bulky groups has produced analogs showing selective loss of 
immunosuppressive activity while retaining FKBP-binding (see Luengo et al., 1995, 
Chemistry & Biology 2: 471-481). One of the best compounds, 1, below, shows complete 
loss of activity in the splenocyte proliferation assay with only a 10-fold reduction in 
binding to FKBP. 




There are also synthetic analogs of FKBP binding domains. These compounds 
reflect an approach to obtaining neuroimmunophilin ligands based on "rationally 
designed" molecules that retain the FKBP-binding region in an appropriate conformation 
for binding to FKBP, but do not possess the effector binding regions. In one example, the 
ends of the FKBP binding domain were tethered by hydrocarbon chains (see Holt et al., 
1993, Journal of the American Chemical Society 115: 9925-9938); the best analog, 2, 
below, binds to FKBP about as well as FK-506. In a similar approach, the ends of the 
FKBP binding domain were tethered by a tripeptide to give analog 3, below, which binds 
to FKBP about 20-fold less tightly than FK-506. These compounds are anticipated to have 
neuroimmunophilin binding activity. 

2 3 



In a primate MPTP model of Parkinson's disease, administration of FKBP ligand 
GPI-1046 caused brain cells to regenerate and behavioral measures to improve. MPTP is 
a neurotoxin, which, when administered to animals, selectively damages nigral-striatal 
dopamine neurons in the brain, mimicking the damage caused by Parkinson's disease. 
Whereas, before treatment, animals were unable to use affected limbs, the FKBP ligand 
restored the ability of animals to feed themselves and gave improvements in measures of 
locomotor activity, neurological outcome, and fine motor control. There were also 
corresponding increases in regrowth of damaged nerve terminals. These results 
demonstrate the utility of FKBP ligands for treatment of diseases of the CNS. 

From the above description, two general approaches towards the design of non- 
immunosuppressant, neuroimmunophilin ligands can be seen. The first involves the 
construction of constrained cyclic analogs of FK-506 in which the FKBP binding domain 
is fixed in a conformation optimal for binding to FKBP. The advantages of this approach 
are that the conformation of the analogs can be accurately modeled and predicted by 
computational methods, and the analogs closely resemble parent molecules that have 
proven pharmacological properties. A disadvantage is that the difficult chemistry limits 
the numbers and types of compounds that can be prepared. The second approach involves 
the trial and error construction of acyclic analogs of the FKBP binding domain by 
conventional medicinal chemistry. The advantage of this approach is that the chemistry 
is suitable for production of the numerous compounds needed for such interactive 
chemistry-bioassay approaches. The disadvantages are that the molecular types of 
compounds that have emerged have no known history of appropriate pharmacological 
properties, have rather labile ester functional groups, and are too conformationally mobile 
to allow accurate prediction of conformational properties. 

The present invention provides useful methods and reagents related to the first 
approach, but with significant advantages. The invention provides recombinant PKS 
genes that produce a wide variety of polyketides that cannot otherwise be readily 
synthesized by chemical methodology alone. Moreover, the present invention provides 
polyketides that have either or both of the desired immunosuppressive and neurotrophic 
activities, some of which are produced only by fermentation and others of which are 
produced by fermentation and chemical modification. Thus, in one aspect, the invention 
provides compounds that optimally bind to FKBP but do not bind to the effector proteins. 
The methods and reagents of the invention can be used to prepare numerous constrained 
cyclic analogs of FK-520 in which the FKBP binding domain is fixed in a conformation 
optimal for binding to FKBP. Such compounds will show neuroimmunophilin binding 
(neurotrophic) but not immunosuppressive effects. The invention also allows direct 
manipulation of FK-520 and related chemical structures via genetic engineering of the 
enzymes involved in the biosynthesis of FK-520 (as well as related compounds, such as 
FK-506 and rapamycin); similar chemical modifications are simply not possible because 
of the complexity of the structures. The invention can also be used to introduce "chemical 
handles" into normally inert positions that permit subsequent chemical modifications. 

Several general approaches to achieve the development of novel 
neuroimmunophilin ligands are facilitated by the methods and reagents of the present 
invention. One approach is to make "point mutations" of the functional groups of the 
parent FK-520 structure that bind to the effector molecules to eliminate their binding 
potential. These types of structural modifications are difficult to perform by chemical 
modification, but can be readily accomplished with the methods and reagents of the 
invention. 

A second, more extensive approach facilitated by the present invention is to 
utilize molecular modeling to predict optimal structures ab initio that bind to FKBP but 
not effector molecules. Using the available X-ray crystal structure of FK-520 (or FK-506) 
bound to FKBP, molecular modeling can be used to predict polyketides that should 
optimally bind to FKBP but not calcineurin. Various macrolide structures can be 
generated by linking the ends of the FKBP-binding domain with "all possible" polyketide 
chains of variable length and substitution patterns that can be prepared by genetic 
manipulation of the FK-520 or FK-506 PKS gene cluster in accordance with the methods 
of the invention. The ground state conformations of the virtual library can be determined, 
and compounds that possess binding domains most likely to bind well to FKBP can be 
prepared and tested. 
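
By way of illustration only, the enumeration of such a virtual library can be sketched in Python as follows. The extender unit names, the reduction states, and the function names are assumptions made for this example and are not part of the specification. The sketch enumerates only the combinatorial identities of the linking chains; the conformational analysis described above would be a separate step.

```python
from itertools import product

# Hypothetical building-block choices for each PKS module (illustrative only).
EXTENDER_UNITS = ("malonyl", "methylmalonyl", "ethylmalonyl", "hydroxymalonyl")
# Hypothetical outcomes of beta-carbon processing in each module (illustrative only).
REDUCTION_STATES = ("keto", "hydroxyl", "alkene", "methylene")


def enumerate_linkers(min_modules, max_modules):
    """Yield every linker chain as a tuple of (extender, reduction) pairs."""
    module_choices = list(product(EXTENDER_UNITS, REDUCTION_STATES))
    for n_modules in range(min_modules, max_modules + 1):
        # One independent choice per module, for every permitted chain length.
        yield from product(module_choices, repeat=n_modules)


def library_size(min_modules, max_modules):
    """Count the chains without materializing them."""
    per_module = len(EXTENDER_UNITS) * len(REDUCTION_STATES)
    return sum(per_module ** n for n in range(min_modules, max_modules + 1))


if __name__ == "__main__":
    print(library_size(1, 3))
```

With four extender choices and four reduction states per module, chains of one to three modules already give 16 + 256 + 4,096 = 4,368 candidates, which illustrates why such a library is filtered computationally before any compound is prepared.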

Once a compound is identified in accordance with the above approaches, the 
invention can be used to generate a focused library of analogs around the lead candidate, 
to "fine tune" the compound for optimal properties. Finally, the genetic engineering 
methods of the invention can be directed towards producing "chemical handles" that 
enable medicinal chemists to modify positions of the molecule previously inert to 
chemical modification. This opens the path to previously prohibited chemical 
optimization of lead compounds by time-proven approaches. 

Moreover, the present invention provides polyketide compounds, and the 
recombinant genes for the PKS enzymes that produce them, that have 
significant advantages over FK-506 and FK-520 and their analogs. The metabolism and 
pharmacokinetics of tacrolimus have been extensively studied, and FK-520 is believed to 
be similar in these respects. Absorption of tacrolimus from the gastrointestinal tract is 
rapid, variable, and incomplete (Harrison's Principles of Internal Medicine, 14th edition, 
1998, McGraw Hill, 14, 20, 21, 64-67). The mean bioavailability of the oral dosage form 
is 27% (range 5 to 65%). The volume of distribution (VolD) based on plasma is 5 to 65 
L per kg of body weight (L/kg), and is much higher than the VolD based on whole blood 
concentrations, the difference reflecting the binding of tacrolimus to red blood cells. 
Whole blood concentrations may be 12 to 67 times the plasma concentrations. Protein 
binding is high (75 to 99%), primarily to albumin and alpha1-acid glycoprotein. The 
half-life for distribution is 0.9 hour; elimination is biphasic and variable, with a terminal half-life of 11.3 hours 
(range, 3.5 to 40.5 hours). The time to peak concentration is 0.5 to 4 hours after oral 
administration. 
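
The pharmacokinetic figures above can be related to one another by simple arithmetic. The following Python sketch is illustrative only. It assumes single-exponential (first-order) elimination governed by the terminal half-life, which is a simplification because elimination is described above as biphasic, and the function names are hypothetical.

```python
import math

MEAN_ORAL_BIOAVAILABILITY = 0.27  # mean value quoted in the text (range 0.05 to 0.65)
TERMINAL_HALF_LIFE_H = 11.3       # hours; mean value quoted in the text


def absorbed_dose_mg(oral_dose_mg, bioavailability=MEAN_ORAL_BIOAVAILABILITY):
    """Amount of an oral dose expected to reach the systemic circulation."""
    return oral_dose_mg * bioavailability


def fraction_remaining(hours, half_life_h=TERMINAL_HALF_LIFE_H):
    """Fraction of drug remaining after `hours`, assuming first-order elimination."""
    rate_constant = math.log(2) / half_life_h  # k = ln 2 / t_half
    return math.exp(-rate_constant * hours)


if __name__ == "__main__":
    print(absorbed_dose_mg(10.0))
    print(fraction_remaining(24.0))
```

Under these assumptions one half of the absorbed drug remains after 11.3 hours. Substituting the extremes of the quoted range (3.5 and 40.5 hours) for the half-life shows how strongly the reported variability affects exposure.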

Tacrolimus is metabolized primarily by cytochrome P450 3A enzymes in the liver 
and small intestine. The drug is extensively metabolized, with less than 1% excreted 
unchanged in urine. Because hepatic dysfunction decreases clearance of tacrolimus, doses 
have to be reduced substantially in primary graft non-function, especially in children. In 
addition, drugs that induce the cytochrome P450 3A enzymes reduce tacrolimus levels, 
while drugs that inhibit these P450s increase tacrolimus levels. Tacrolimus bioavailability 
doubles with co-administration of ketoconazole, a drug that inhibits P450 3A. See 
Vincent et al., 1992, In vitro metabolism of FK-506 in rat, rabbit, and human liver 
microsomes: Identification of a major metabolite and of cytochrome P450 3A as the 
major enzymes responsible for its metabolism, Arch. Biochem. Biophys. 294: 454-460; 
Iwasaki et al., 1993, Isolation, identification, and biological activities of oxidative 
metabolites of FK-506, a potent immunosuppressive macrolide lactone, Drug Metabolism 
& Disposition 21: 971-977; Shiraga et al., 1994, Metabolism of FK-506, a potent 
immunosuppressive agent, by cytochrome P450 3A enzymes in rat, dog, and human liver 
microsomes, Biochem. Pharmacol. 47: 727-735; and Iwasaki et al., 1995, Further 
metabolism of FK-506 (Tacrolimus); Identification and biological activities of the 
metabolites oxidized at multiple sites of FK-506, Drug Metabolism & Disposition 23: 
28-34. The cytochrome P450 3A subfamily of isozymes has been implicated as important in 
this degradative process. 

Structures of the eight isolated metabolites formed by liver microsomes are shown 
in Figure 6. Four metabolites of FK-506 involve demethylation of the oxygens on 
carbons 13, 15, and 31, and hydroxylation of carbon 12. The 13-demethylated (hydroxy) 
compounds undergo cyclizations of the 13-hydroxy at C-10 to give M-I, M-VI, and M-VII, 
and the 12-hydroxy metabolite at C-10 to give I. Another four metabolites formed by 
oxidation of the four metabolites mentioned above were isolated using liver microsomes 
from dexamethasone-treated rats. Three of these are metabolites doubly demethylated at 
the methoxy groups on carbons 15 and 31 (M-V), 13 and 31 (M-VI), and 13 and 15 
(M-VII). The fourth, M-VIII, was the metabolite produced after demethylation of the 
31-methoxy group, followed by formation of a fused ring system by further oxidation. 
Among the eight metabolites, M-II has immunosuppressive activity comparable to that of 
FK-506, whereas the other metabolites exhibit weak or negligible activities. Importantly, 
the major metabolite of human, dog, and rat liver microsomes is the 13-demethylated and 
cyclized FK-506 (M-I). 

Thus, the major metabolism of FK-506 proceeds via 13-demethylation followed 
by cyclization to the inactive M-I, this representing about 90% of the metabolic products 
after a 10 minute incubation with liver microsomes. Analogs of tacrolimus that do not 
possess a C-13 methoxy group would not be susceptible to the first and most important 
biotransformation in the destructive metabolism of tacrolimus (i.e. cyclization of 13- 
hydroxy to C-10). Thus, a 13-desmethoxy analog of FK-506 should have a longer half- 
life in the body than does FK-506. The C-13 methoxy group is believed not to be 
required for binding to FKBP or calcineurin. The C-13 methoxy is not present at the 
corresponding position of rapamycin, which binds to FKBP with an affinity equal to that of 
tacrolimus. Also, analysis of the 3-dimensional structure of the 
FKBP-tacrolimus-calcineurin complex shows that the C-13 methoxy has no interaction with FKBP and only 
a minor interaction with calcineurin. The present invention provides C-13-desmethoxy 
analogs of FK-506 and FK-520, as well as the recombinant genes that encode the PKS 
enzymes that catalyze their synthesis and host cells that produce the compounds. 

These compounds exhibit, relative to their naturally occurring counterparts, 
prolonged immunosuppressive action in vivo, thereby allowing a lower dosage and/or 
reduced frequency of administration. Dosing is more predictable, because the variability 
in FK-506 dosage is largely due to variation of metabolism rate. FK-506 levels in blood 
can vary widely depending on interactions with drugs that induce or inhibit cytochrome 
P450 3A (summarized in USP Drug Information for the Health Care Professional). Of 
particular importance are the numerous drugs that inhibit or compete for CYP 3A, 
because they increase FK-506 blood levels and lead to toxicity (Prograf package insert, 
Fujisawa US, Rev 4/97, Rec 6/97). Also important are the drugs that induce P450 3A 
(e.g., dexamethasone), because they decrease FK-506 blood levels and reduce efficacy. 
Because the major site of CYP 3A action on FK-506 is removed in the analogs provided 
by the present invention, those analogs are not as susceptible to drug interactions as the 
naturally occurring compounds. 

Hyperglycemia, nephrotoxicity, and neurotoxicity are the most significant adverse 
effects resulting from the use of FK-506 and are believed to be similar for FK-520. 
Because these effects appear to occur primarily by the same mechanism as the 
immunosuppressive action (i.e. FKBP-calcineurin interaction), the intrinsic toxicity of the 
desmethoxy analogs may be similar to FK-506. However, toxicity of FK-506 is dose 
related and correlates with high blood levels of the drug (Prograf package insert, 
Fujisawa US, Rev 4/97, Rec 6/97). Because the levels of the compounds provided by 
the present invention should be more controllable, the incidence of toxicity should be 
significantly decreased with the 13-desmethoxy analogs. Some reports show that certain 
FK-506 metabolites are more toxic than FK-506 itself, and this provides an additional 
reason to expect that a CYP 3A-resistant analog can have lower toxicity and a higher 
therapeutic index. 

Thus, the present invention provides novel compounds related in structure to FK- 
506 and FK-520 but with improved properties. The invention also provides methods for 
making these compounds by fermentation of recombinant host cells, as well as the 
recombinant host cells, the recombinant vectors in those host cells, and the recombinant 
proteins encoded by those vectors. The present invention also provides other valuable 
materials useful in the construction of these recombinant vectors that have many other 
important applications as well. In particular, the present invention provides the FK-520 
PKS genes, as well as certain genes involved in the biosynthesis of FK-520 in 
recombinant form. 

FK-520 is produced at relatively low levels in the naturally occurring cells, 
Streptomyces hygroscopicus var. ascomyceticus, in which it was first identified. Thus, 
another benefit provided by the recombinant FK-520 PKS and related genes of the 
present invention is the ability to produce FK-520 in greater quantities in the recombinant 
host cells provided by the invention. The invention also provides methods for making 
novel FK-520 analogs, in addition to the desmethoxy analogs described above, and 
derivatives in recombinant host cells of any origin. 

The biosynthesis of FK-520 involves the action of several enzymes. The FK-520 
PKS enzyme, which is composed of the fkbA, fkbB, fkbC, and fkbP gene products, 
synthesizes the core structure of the molecule. There is also a hydroxylation at C-9 
mediated by the P450 hydroxylase that is the fkbD gene product; the resulting hydroxyl 
group is oxidized by the fkbO gene product to result in the formation of a keto group at C-9. There is also a 
methylation at C-31 that is mediated by an O-methyltransferase that is the fkbM gene 
product. There are also methylations at the C-13 and C-15 positions by a 
methyltransferase believed to be encoded by the fkbG gene; this methyltransferase may 
act on the hydroxymalonyl CoA substrates prior to binding of the substrate to the AT 
domains of the PKS during polyketide synthesis. The present invention provides the 
genes encoding these enzymes in recombinant form. The invention also provides the 
genes encoding the enzymes involved in ethylmalonyl CoA and 2-hydroxymalonyl CoA 
biosynthesis in recombinant form. Moreover, the invention provides Streptomyces 
hygroscopicus var. ascomyceticus recombinant host cells that lack one or more of these 
genes and are useful in the production of useful compounds. 

The cells are useful in production in a variety of ways. First, certain cells make a 
useful FK-520-related compound merely as a result of inactivation of one or more of the 
FK-520 biosynthesis genes. Thus, by inactivating the C-31 O-methyltransferase gene in 
Streptomyces hygroscopicus var. ascomyceticus, one creates a host cell that makes a 
desmethyl (at C-31) derivative of FK-520. Second, other cells of the invention are unable 
to make FK-520 or FK-520 related compounds due to an inactivation of one or more of 
the PKS genes. These cells are useful in the production of other polyketides produced by 
PKS enzymes that are encoded on recombinant expression vectors and introduced into 
the host cell. 

Moreover, if only one PKS gene is inactivated, the ability to produce FK-520 or 
an FK-520 derivative compound is restored by introduction of a recombinant expression 
vector that contains the functional gene in a modified or unmodified form. The 
introduced gene produces a gene product that, together with the other endogenous and 
functional gene products, produces the desired compound. This methodology enables one 
to produce FK-520 derivative compounds without requiring that all of the genes for the 
PKS enzyme be present on one or more expression vectors. Additional applications and 
benefits of such cells and methodology will be readily apparent to those of skill in the art 
after consideration of how the recombinant genes were isolated and employed in the 
construction of the compounds of the invention. 

The FK-520 biosynthetic genes were isolated by the following procedure. 
Genomic DNA was isolated from Streptomyces hygroscopicus var. ascomyceticus 
(ATCC 14891) using the lysozyme/proteinase K protocol described in Genetic 
Manipulation of Streptomyces - A Laboratory Manual (Hopwood et al., 1986). The 
average size of the DNA was estimated to be between 80 and 120 kb by electrophoresis on 
0.3% agarose gels. A library was constructed in the SuperCos™ vector according to the 
manufacturer's instructions and with the reagents provided in the commercially available 
kit (Stratagene). Briefly, 100 µg of genomic DNA was partially digested with 4 units of 
Sau3A I for 20 min. in a reaction volume of 1 mL, and the fragments were 
dephosphorylated and ligated to SuperCos vector arms. The ligated DNA was packaged 
and used to infect log-stage XL1-Blue MR cells. A library of about 10,000 independent 
cosmid clones was obtained. 
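
The adequacy of a library of this size can be estimated with the standard Clarke and Carbon relationship, P = 1 - (1 - f)^N, where f is the ratio of insert size to genome size and N is the number of independent clones. The Python sketch below is illustrative only: the 40 kb insert size and the 8 Mb genome size are assumed round figures, not values reported above, and the function names are hypothetical.

```python
import math

ASSUMED_INSERT_BP = 40_000     # assumed typical cosmid insert size (not from the text)
ASSUMED_GENOME_BP = 8_000_000  # assumed genome size for the organism (not from the text)


def coverage_probability(n_clones, insert_bp=ASSUMED_INSERT_BP,
                         genome_bp=ASSUMED_GENOME_BP):
    """Probability that a given sequence is present in a library of n_clones."""
    fraction = insert_bp / genome_bp
    return 1.0 - (1.0 - fraction) ** n_clones


def clones_needed(probability, insert_bp=ASSUMED_INSERT_BP,
                  genome_bp=ASSUMED_GENOME_BP):
    """Smallest library size that reaches the requested probability."""
    fraction = insert_bp / genome_bp
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - fraction))


if __name__ == "__main__":
    print(clones_needed(0.99))
    print(coverage_probability(10_000))
```

Under these assumed sizes, fewer than 1,000 clones would give a 99% chance of including any particular sequence, so a library of about 10,000 clones represents roughly a tenfold margin.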

Based on recently published sequence from the FK-506 cluster (Motamedi and 
Shafiee, 1998, Eur. J. Biochem. 256: 528), a probe for the fkbO gene was isolated from 
ATCC 14891 using PCR with degenerate primers. With this probe, a cosmid designated 
pKOS034-124 was isolated from the library. With probes made from the ends of cosmid 
pKOS034-124, an additional cosmid designated pKOS034-120 was isolated. These 
cosmids (pKOS034-124 and pKOS034-120) were shown to contain DNA inserts that 
overlap with one another. Initial sequence data from these two cosmids generated 
sequences similar to sequences from the FK-506 and rapamycin clusters, indicating that 
the inserts were from the FK-520 PKS gene cluster. Two EcoRI fragments were 
subcloned from cosmids pKOS034-124 and pKOS034-120. These subclones were used 
to prepare shotgun libraries by partial digestion with Sau3A I, gel purification of 
fragments between 1.5 kb and 3 kb in size, and ligation into the pLitmus28 vector (New 
England Biolabs). These libraries were sequenced using dye terminators on a Beckman 
CEQ2000 capillary electrophoresis sequencer, according to the manufacturer's protocols. 
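
A degenerate primer is written with IUPAC ambiguity codes, each of which stands for a set of bases, so that one primer string represents a pool of oligonucleotides. The Python sketch below expands such a string into its member sequences. The primer shown is a made-up example and is not the primer used to isolate the fkbO probe, whose sequence is not given here; the function names are likewise hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer):
    """List every concrete sequence represented by a degenerate primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(bases) for bases in product(*choices)]


if __name__ == "__main__":
    example = "GGNTCSAC"  # hypothetical primer, for illustration only
    print(degeneracy(example))
    print(expand(example))
```

The degeneracy grows multiplicatively with each ambiguous position, which is why degenerate primers are usually designed against the least ambiguous stretch of a conserved protein motif.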

To obtain cosmids containing sequence on the left and right sides of the 
sequenced region described above, a new cosmid library of ATCC 14891 DNA was 
prepared essentially as described above. This new library was screened with a new fkbM 
probe isolated using DNA from ATCC 14891. A probe representing the fkbP gene at the 
end of cosmid pKOS034-124 was also used. Several additional cosmids to the right of the 
previously sequenced region were identified. Cosmids pKOS065-C31 and pKOS065-C3 
were identified and then mapped with restriction enzymes. Initial sequences from these 
cosmids were consistent with the expected organization of the cluster in this region. More 
extensive sequencing showed that both cosmids contained, in addition to the desired 
sequences, other sequences not contiguous to the desired sequences on the host cell 
chromosomal DNA. Probing of additional cosmid libraries identified two additional 
cosmids, pKOS065-M27 and pKOS065-M21, that contained the desired sequences in a 
contiguous segment of chromosomal DNA. Cosmids pKOS034-124, pKOS034-120, 
pKOS065-M27, and pKOS065-M21 have been deposited with the American Type 
Culture Collection, Manassas, VA, USA. The complete nucleotide sequence of the 
coding sequences of the genes that encode the proteins of the FK-520 PKS is shown 
below but can also be determined from the cosmids of the invention deposited with the 
ATCC using standard methodology. 

Referring to Figures 1 and 3, the FK-520 PKS gene cluster is composed of four 
open reading frames designated fkbB, fkbC, fkbA, and fkbP. The fkbB open reading frame 
encodes the loading module and the first four extender modules of the PKS. The fkbC 
open reading frame encodes extender modules five and six of the PKS. The fkbA open 
reading frame encodes extender modules seven, eight, nine, and ten of the PKS. The fkbP 
open reading frame encodes the NRPS of the PKS. Each of these genes can be isolated 
from the cosmids of the invention described above. The DNA sequences of these genes 
are provided below, preceded by the following table identifying the start and stop codons 
of the open reading frames of each gene and the modules and domains contained therein. 

Nucleotides                     Gene or Domain 

complement (412 - 1836)         fkbW 
complement (2020 - 3579)        fkbV 
complement (3969 - 4496)        fkbR2 
complement (4595 - 5488)        fkbR1 
5601 - 6818                     fkbE 
6808 - 8052                     [?] 
8156 - [?]824                   [?] 
complement (9122 - 9883)        [?] 
complement (9894 - 10994)       [?] 
complement (10987 - 11247)      [?] 
complement (11244 - 12092)      [?] 
complement (12113 - 13150)      [?] 
complement ([?]212 - 23988)     [?] 
complement ([?] - 46573)        [?] 
46754 - 47788                   [?] 
47785 - 52272                   [?] 
52275 - 71465                   [?] 
71462 - 72628                   [?] 
72625 - 73407                   fkbM 
complement (73460 - 76202)      fkbN 
complement (76336 - 77080)      fkbQ 
complement (77076 - 7753[?])    fkbS 
complement (44974 - 4657[?])    CoA ligase of loading domain 
complement (43777 - 44629)      ER of loading domain 
complement (43144 - 43660)      ACP of loading domain 
complement (41842 - 43093)      KS of extender module 1 (KS1) 
complement (40609 - 41842)      AT1 
complement (39442 - 40609)      DH1 
complement (38677 - 39307)      [?]1 
complement (38371 - 38581)      ACP1 
complement (37145 - 38296)      KS2 
complement (35749 - 37144)      AT2 
complement (34606 - 35749)      DH2 (inactive) 
complement (33823 - 34480)      KR2 
complement (33505 - 33715)      ACP2 
complement (32185 - 33439)      KS3 
complement (31018 - 32185)      AT3 
complement (29869 - 31018)      DH3 (inactive) 
complement (29092 - 29740)      KR3 
complement (28750 - 28960)      ACP3 
complement (27430 - 28684)      KS4 
complement (26146 - 27430)      AT4 
complement (24997 - 26146)      DH4 (inactive) 
complement (24163 - 24373)      ACP4 
complement (22653 - 23892)      KS5 
complement (21420 - 22653)      AT5 
complement (20241 - 21420)      DH5 
complement (19464 - 20097)      KR5 
complement (19116 - 19326)      ACP5 
complement (17820 - [?])        [?] 
complement (16587 - [?])        [?] 
complement (15438 - [?])        [?] 
complement (14517 - [?])        [?] 
complement (13761 - [?])        [?] 
complement ([?]3452 - [?])      [?] 
52362 - 535[?]                  [?] 
53577 - 5471[?]                 [?] 
54717 - 55871                   [?] 
56019 - 56819                   [?] 
[?] - 57575                     [?] 
[?]7710 - 57929                 [?] 
57990 - 59243                   [?] 
59244 - 60398                   [?] 
60399 - 61412                   [?] 
61548 - 62180                   [?] 
62328 - 62537                   [?] 
62598 - 63854                   [?] 
63855 - 65084                   [?] 
65085 - 66254                   [?] 
66399 - 67175                   [?] 
67299 - 67931                   [?] 
68094 - 68303                   [?] 
68397 - 69653                   [?] 
69654 - 70985                   [?] 
71064 - 71273                   [?] 
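
The locations in the table follow the usual feature-location convention, in which "complement" indicates that the coding sequence lies on the reverse strand. As a hedged illustration, the Python sketch below parses such a location string and extracts the corresponding coding strand from a sequence. It assumes that coordinates are 1-based and inclusive, the function names are hypothetical, and only the two simple forms that appear in the table are handled.

```python
import re

_LOCATION = re.compile(r"\s*(complement)?\s*\(?\s*(\d+)\s*-\s*(\d+)\s*\)?\s*")
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def parse_location(text):
    """Return (start, end, is_complement) for a location such as 'complement (412 - 1836)'."""
    match = _LOCATION.fullmatch(text)
    if match is None:
        raise ValueError(f"unrecognized location: {text!r}")
    start, end = int(match.group(2)), int(match.group(3))
    if start > end:
        raise ValueError(f"start exceeds end: {text!r}")
    return start, end, match.group(1) is not None


def feature_length(text):
    """Length in nucleotides, counting both end points."""
    start, end, _ = parse_location(text)
    return end - start + 1


def extract_feature(sequence, text):
    """Return the coding strand for the location (reverse complement if needed)."""
    start, end, is_complement = parse_location(text)
    segment = sequence[start - 1:end]  # convert 1-based inclusive to a Python slice
    if is_complement:
        return segment.translate(_COMPLEMENT)[::-1]
    return segment


if __name__ == "__main__":
    print(feature_length("complement (412 - 1836)"))
```

For example, complement (412 - 1836) spans 1,425 nucleotides, a whole number of codons, as expected for an open reading frame.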



1 GATCTCAGGC ATGAAGTCCT CCAGGCGAGG CGCCGAGGTG GTGAACACCT CGCCGCTGCT 
61 TGTACGGACC ACTTCAGTCA GCGGCGATTG CGGAACCAAG TCATCCGGAA TAAAGGGCGG 
121 TTACAAGATC CTCACATTGC GCGACCGCCA GCATACGCTG AGTTGCCTCA GAGGCAAACC 
181 GAAAGGGCGC GGGCGGTCCG CACCAGGGCG GAGTACGCGA CGAGAGTGGC GCACCCGCGC 
241 ACCGTCACCT CTCTCCCCCG CCGGCGGGAT GCCCGGCGTG ACACGGTTGG GCTCTCCTCG 
301 ACGCTGAACA CCCGCGCGGT GTGGCGTCGG GGACACCGCC TGGCATCGGC CGGGTGACGG 
361 TACGGGGAGG GCGTACGGCG GCCGTGGCTC GTGCTCACGG CCGCCGGGCG GTCATCCGTC 
421 GAGACGGCAC TCGGCGAGCA GGGACGCCTG GTCGGCACCT GCGGGCCGGA CGACCGTGTG 
481 GTTCGCGGGC GGGCGGTGGC CGGTGGTGAG CCAGCTCTCC AGGGCGGTGA AGGCTGAGCG 
541 GTGACACGGC AGCAAAGGCC GGAGTCGGTC GGGGAAGGTG TCGACGAGGG CGTCGGTGTG 
601 CGTGCCGTCC TCGATGCGGT AGTAGCGGTA CCGGCCGCCA GGCCGCTGCC GGACATACGC 
661 GCGTACACGT CGGAGCCCGG GCGGCAGGCA GCAGCACGTC GAGAGTGCCT GGATGGTGAT 
721 CAGCGGCTTG CCGATACGAC CGGTCAACGC GATGCGTTCC ACGGCCGCGT GGACGCCGGA 
781 GGAGCGGGTG GCGTAGTCGT AGTCGGCATC GChGCCCGGG ACCGTCCCCG GGGCGCAATA 
841 CGGTGTGCCG GCTTCCTTCT CCCCATCGAA GCCGGGGTCG AACTCCTCGC GGTAGACGCG 
901 CTGCGTCAGA TCCCAGTAGA CCTCGTGGTG GTACGGCCAC AAGAACTCGG AGTCGGCCGG 
961 GAACCCGGCG CGGAGCAGCG CCTCGCGCGC CTGGCCGGCT GCGGGGCCGC CTGCCGCGTA 
1021 GGTGGGGTAG TCGCGCAGGG CGGCCGGCAG GAAGGTGAAG AGGTTGGGAC CCTCCGCGCG 
1081 CCACAGGGTG CCTTCCCAGT CGACTCCTCC GTCGTACAGC TCGGGATGGT TCTCCAGCTG 
1141 CCAGCGCACG AGGTAGCCGC CGTTGGACAT CCCGGTGACC AGGGTGCGCT CGAGCGGCCG 
1201 GTGGTAGCGC TGGGCGACCG ACGCGCGGGC GGCCCGGGTC AGCTGGGTGA GGCGGGTGTT 
1261 CCACTCGGCG ACGGCGTCGC CCGGCCGGGA GCCATCACGG TAGAACGCGG GGCCGGTGTT 
1321 GCCCTTGTCG GTGGCGGCGT AGGCGTAACC GCGGGCGAGC ACCCAGTCGG CGATGGCCCG 
1381 GTCGTTGGCG TACTGCTCGC GGTTACCGGG GGTGCCGGCC ACGACCAGGC CACCGTTCCA 
1441 GCGGTCGGGC AGCCGGATGA CGAACTGGGC GTCGTGGTTC CACCCGTGGT TGGTGTTGGT 
1501 GGTGGAGGTG TCGGGGAAGT AGCCGTCGAT CTGGATCCCG GGCACTCCGG TGGGAGTGGC 
1561 CAGGTTCTTG GGCGTCAGCC CTGCCCAGTC CGCCGGGTCG GTGTGGCCGG TGGCCGCCGT 
1621 TCCCGCCGTG GTCAGCTCGT CCAGGCAGTC GGCCTGCTGA CGTGCCGCCG CCGGGACACG 
1681 CAGCTGGGAC AGACGGGCGC AGTGACCGTC CGGGGCATCG GGAGCAGGCC GGGCCGTGGC 
1741 CGGTGAGGGG AGCAGGACGG CGACTGCGGC CAGGGTGAGA GCGCCGAGGC CGGTGCGTCT 
1801 TCTCGGGGCC CGTCCGACAC CGAGGGGCAG AACCATGGAG AGCCTCCAGA CGTGCGGATG 
1861 GATGACGGAC TGGAGGCTAG GTCGCGCACG GTGGAGACGA ACATGGGTGC GCCCGCCATG 
1921 ACTGAGGCCC CTCAGAGGTG GGCCGCCGCC ATGACGGGCG CGGGACCGCG GGCGCTCCGG 
1981 GGCGGTGCCC GCGGCCGCCA CCGGTTCCGG GTCCCCGGGT CAGGGACAGG TGTCGTTCGC 
2041 GACGGTGAAG TAGCCGGTCG GCGACTCTTT CAAGGTGGTC GTGACGAAGG TGTTGTACAG 
2101 GCCCATGTTC TGGCCGGAGC CCTTGGCGTA GGTGTAACCG GCGCTCGTCG TGGCGCGGCC 
2161 CGCCTGGACG TGAGCGTAGT TGCCGGCGGT CCAGCAGACG GCCGTGGCAC CGGTCGTCTG 
2221 CGCGGTGACC GCGCCCGAGA GCGGTCCGGC CTTGCCGTCC GCGTCCCGGG CGGCGACCGC 
2281 GTAGGTGTGC GATGTGCCCG CCCTCAGGCC GGTGTCCGTG TACGACGTCG TGGCGGACGT 
2341 GGTGATCTGG GCACCGTCGC GGTGGACGGC GTAGTCGGTG GCGCCGTCGA CGGGTTTCCA 
2401 GGTCAGGCTG ATGGTGGTGT CGGTGGCGCC GGTGGCGGCC AGGCCGGACG GAGCGGGCAG 
2461 CGAACCGGGG TCGGAGGCGG ATCCGCTCAG GCCGAAGAAC TGCGTGATCC AGTAGCTGGA 
2521 ACAGATCGAG TCCAGGAAGT AGGCGGCGCC GGTGCTGCCG CACTGCTGTG CTCCGGTGCC 
2581 GGGATCGACC GGGGTGCCGT GCCCGATGCC CGGCACCCGG TTCACCTCCA CGGCCACCGA 
2641 TCCGTCCGCG GCCAGGTACT CCTCGTGCCG GGTGGAGTTC GGGCCGATCA CCGAGGTACG 
2701 GTCCGGCGTC TGGGACACGC CGTGCACAGC GGTCCACTGG TCGCGCAACT CGTCGGCGTT 
2761 GCGCGGCGCG ACGGTGGTGT CCTTGTCGCC GTGCCAGATG GCCACGCGCG GCCACGGGCC 
2821 CGACCACGAG GGGTAGCCGT CACGGACCCG CCGCGCCCAC TGGTCCGCGG TCAGGTCGGT 
2881 CCCGGGGTTC ATGCACAGGT ACGCGCTGCT GACGTCGGTG GCACAGCCGA AGGGCAGGCC 
2941 GGCGACGACC GCGCCGGCCT GGAAGACGTC CGGATAGGTG GCGAGCATCA CCGACGTCAT 
3001 GGCACCGCCG GCGGACAGCC CGGTGATGTA GGTGCGCTGG GGGTCCGCGC CGTAGGCGGA 
3061 GACGGTGTGA GCGGCCATCT GCCGGATCGA CGCGGCTTCG CCCTGGCCCC TGCGGTTGTC 
3121 GCTGCTCTGG AACCAGTTGA AGCACCTGTT CGCGTTGTTC GACGACGTGG TCTCGGCGAA 
3181 CACGAGCAGG AAGCCATAGC GGTCCGCGAA TGAGAGCAGG CCGGAGTTGT CGGCGTAGCC 
3241 CTGGGCGTCC TGGGTGCAAC CGTGCAGGGC GAACACCACC GCCGGCTCCG CGGGCAGGGA 
3301 CGCGGGCCGG TAGACGTACA TGTTCAGCCG GCCCGGGTTC GTGCCGAAGT CCGCGACCTC 
3361 GGTCAGGTCC GCCTTGGTCA GACCGGGCTT GGCCAGGCCC GCCGCGGCGT GGGCCGTCGG 
3421 CGCCGGGCCG AGCAGGGCCG CTCCGAGTAC GAGGGCCACG ACGGCCACGA GACGGGTGAG 
3481 CACCCCCCGC CGTCCCGGAC GCGACAACGA CCCGACCGGC GGCGAGGAGG AGAGGGGGAA 
3541 CAGCGGGGTG AGGATTCCCC GGAACGGCGG CGGCTGCATG GCGGCTCCCT CGATGTCGTG 
3601 GGGGGGACAC GGAGGGCTCC CTGACGTCGA TCAGTGGGAG CGCCCCGGTG CCCGGCACCG 
3661 TAGGGGTGGT TCAACCCGCA ACGGTATGGC CCGGAGCACC ACACCCCGCA CCGCGCGATG 
3721 TGCGCCCGGA CGGATTGTGT CGCCTTGCGG AATCTGATAC CCGGACGCGA CGAACGCCCC 
3781 ACCCGACACG GGTAGGGCGT CATGGTGTCC GACTCGGCCG GTCGGCCTTG CCTGCCCTGG 
3841 ACGGACCGGG CGTCGGCGGA CCGGGCGTCG GCGGGCTGGG CGGTATGGCG GCCGAGGACG 
3901 CCAGCCGCGT GGGGCGGCCG CGCCCAAGTG CAGTACGCCG ACCGTGGCCG GCGGGAGGGC 
3961 CGGACCGGTC AGTGCAGTCC CGCGGCCCTG CGGGACCGCT CGTCCCAGAC GGGTTCCACC 
4021 GCGGCGAACC GGGGTCCGTG TCCGCGGCGG TAGACCATCA GTGTCCGCTC GAAGGTGATG 
4081 ACGATGACAC CGTCCTGGTT GTAGCCGATG GTGCGCACGC TGATGATGCC TACGTCAGGT 
4141 CGGCTGGCGG ACTCCCGGGT GTTCAGGACC TCGGACTGCG AGTAGATGGT GTCGCCCTCG 
4201 AAGACCGGGT TCGGCAGCCT GACCCGGTCC CAGCCGAGGT TGGCCATCAC ATGCTGGGAG 
4261 ATGTCGGTGA CGCTCTGCCC GGTGACCAGG GCGAGGGTGA AGGTGGAGTC CACCAGCGGC 
4321 TTGCCCCAGG TGGTGCCCGC CGAGTAGTGG CGGTCGAAGT GCAGCGGCGC GGTGTTCTGC 
4381 GTCAGGAGCG TGAGCCAGGA GTTGTCGGTC TCCAGGACCG TGCGGCCCAG GGGGTGGCGG 
4441 TACACGTCGC CGGTGGTGAA GTCCTCGAAG TAGCGGCCCT GCCAGCCCTC GACCACAGCG 
4501 GTGCGGGTGG CGTCCTGGTC CGGGTTCTCA GTCGTCATGG CGCTCATTCT GGGAAGTCCC 
4 561 CGGTCCGCTG TGAAATGCCG AACCTTCACC GGGCTCATAC GTGCGGCGCA TGAGCCCTGG 
4 621 ACCGTACGTA GTCGTAGAAC CTCGCCACCA CTGGCGCGCG TGGTCCTCCG GCGAGTGTGA 
4 681 CCACGCCGAC CGTGCGCCGC GCCTGCGGGT CGTCGAGCGG CACGGCGACG GCGTGGTCAC 
47 41 CGGGCCCGGA CGGGCTGCCG GTGAGGGGGG CGACGGCCAC ACCGAGGCCG GCGGCGACCA 
4 801 GGGCCCGCAG CGTGCTCAGC TCGGTGCTCT CCAGGACGAC CCGCGGCACG AATCCGGCCG 
4 8 61 CGGCGCACAG CCGGTCGGTG ATCTGGCGCA GTCCGAAGAC CGGCTCCAGT GCCACGAACG 
4 921 CCTCATCGGC CAGCTCCGCG GTCCGCACCC GGCGGCGTCT GGCCAGCCGG TGTCCGGGTG 

4 981 GGACGAGCAG GCACAGTGCC TCGTCCCGCA GTGGTGTCCA CTCCACATCG TCCCCGGCGG 
5041 GTCGTGGGCT GGTCAGCCCC AGGTCCAGCC TGCTGTTGCG GACGTCGTCG ACCACGGCGT 
5101 CGGCGGCGTC GCCGCGCAGT TCGAAGGTGG TGCCGGGAGC CAGCCGGCGG TACCCGGCGA 
5161 GGAGGTCGGG CACCAGCCAG GTGCCGTAGG AGTGCAGGAA ACCCAGTGCC ACGGTGCCGG 
5221 TGTCGGGGTC GATCAGGGCG GTGATGCGCT GCTCGGCGCC GGAGACCTCA CTGATCGCGC 
5281 GCAGGGCGTG GGCGCGGAAG ACCTCGCCGT ACTTGTTGAG CCGGAGCCGG TTCTGGTGCC 
5341 GGTCGAACAG CGGCACGCCC ACTCGTCGCT CCAGCCGCCG GATGGCCCTG GACAGGGTCG 
5401 GCTGGGAGAT GTTGAGCCGT TCCGCGGTGA TCGTCACGTG CTCGTGCTCG GCCAAGGCCG 
5461 TGAACCACTG CAACTCCCGT ATCTCCATGC AGGGACTATA CGTACCGGGC ATGGTCCTGG 
5521 CGAGGTTTCG TCATTTCACA GCGGCCGGGC GGCGGCCCAC AGTGAGTCCT CACCAACCAG 
5581 GACCCCATGG GAGGGACCCC ATGTCCGAGC CGCATCCTCG CCCTGAACAG GAACGCCCCG 

5641 CCGGGCCCCT GTCCGGTCTG CTCGTGGTTT CTTTGGAGCA GGCCGTCGCC GCTCCGTTCG 
5701 CCACCCGCCA CCTGGCGGAC CTGGGCGCCC GTGTCATCAA GATCGAACGC CCCGGCAGCG 
5761 GCGACCTCGC CCGCGGCTAC GACCGCACGG TGCGTGGCAT GTCCAGCCAC TTCGTCTGGC 
5821 TGAACCGGGG GAAGGAGAGC GTCCAGCTCG ATGTGCGCTC GCCGGAGGGC AACCGGCACC 
5881 TGCACGCCTT GGTGGACCGG GCCGATGTCC TGGTGCAGAA TCTGGCACCC GGCGCCGCGG 
5941 GCCGCCTGGC ATCGGCCACC AGGTCCTCGC GCGGAGCCAC CGAGGCTGAT CACCTGCGGA 
6001 CATATCCGGC TACGGCAGTA CCGGCTGCTA CCGCGGACCG CAAGGCGTAC GACCTCCTGG 
6061 TCCAGTGCGA AGCGGGGCTG GTCTCCATCA CCGGCACCCC CGAGACCCCG TCCAAGGTGG 
6121 GCCTGTCCAT CGCGGACATC TGTGCGGGGA TGTACGCGTA CTCCGGCATC CTCACGGCCC 
6181 TGCTGAAGCG GGCCCGCACC GGCCGGGGCT CGCAGTTGGA GGTCTCGATG CTCGAAGCCC 
6241 TCGGTGAATG GATGGGATAC GCCGAGTACT ACACGCGCTA CGGCGGCACC GCTCCGGCCC 
6301 GCGCCGGCGC CAGCCACGCG ACGATCGCCC CCTACGGCCC GTTCACCACG CGCGACGGGC 
6361 AGACGATCAA TCTCGGGCTC CAGAACGAGC GGGAGTGGGC TTCCTTCTGC GGTGTCGTGC 
6421 TACAACGCCC CGGTCTCTGC GACGACCCGC GCTTTTCCGG CAACGCCGAC CGGGTGGCGC 
6481 ACCGCACCGA GCTCGACGCC CTGGTGAGCG AGGTGACGGG CACGCTCACC GGCGAGGAAC 
6541 TGGTGGCGCG GCTGGAGGAG GCGTCGATCG CCTACGCACG CCAGCGCACC GTGCGGGAGT 
6601 TCAGCGAACA CCCCCAACTG CGTGACCGTG GACGCTGGGC TCCGTTCGAC AGCCCGGTCG 
6661 GTGCGCTGGA GGGCCTGATC CCCCCGGTCA CCTTCCACGG CGAGCACCCG CGGCGGCTGG 
6721 GCCGGGTCCC GGAGCTGGGC GAGCATACCG AGTCCGTCCT GGCGTGGCTG GCCGCGCCCC 
6781 ACAGCGCCGA CCGCGAAGAG GCCGGCCATG CCGAATGAAC TCACCGGAGT CCTGATCCTG 
6841 GCCGCCGTGT TCCTGCTCGC CGGCGTACGG GGGCTGAACA TGGGCCTGCT CGCGCTGGTC 
6901 GCCACCTTTC TGCTCGGGGT GGTCGCACTC GACCGAACGC CGGACGAGGT GCTGGCGGGT 
6961 TTCCCCGCGA GCATGTTCCT GGTGCTGGTC GCCGTCACGT TCCTCTTCGG GATCGCCCGC 
7021 GTCAACGGCA CGGTGGACTG GCTGGTACGT GTCGCGGTGC GGGCGGTGGG GGCCCGGGTG 
7081 GGAGCCGTCC CCTGGGTGCT CTTCGGCCTG GCGGCACTGC TCTGCGCGAC AGGCGCGGCC 
7141 TCGCCCGCGG CGGTGGCGAT CGTGGCGCCG ATCAGCGTCG CGTTCGCCGT CAGGCACCGC 
7201 ATCGATCCGC TGTACGCCGG ACTGATGGCG GTGAACGGGG CCGCAGCCGG CAGTTTCGCC 
7261 CCCTCCGGGA TCCTGGGCGG CATCGTCCAC TCGGCGCTGG AGAAGAACCA TCTGCCCGTC 
7321 AGCGGCGGGC TGCTCTTCGC AGGCACCTTC GCCTTCAACC TGGCGGTCGC CGCGGTGTCA 
7381 TGGCTCGTCC TCGGGCGCAG GCGCCTCGAA CCACATGACC TGGACGAGGA CACCGATCCC 
7441 ACGGAAGGGG ACCCGGCTTC CCGCCCCGGC GCGGAACACG TGATGACGCT GACCGCGATG 
7501 GCCGCGCTGG TGCTGGGAAC CACGGTCCTC TCCCTGGACA CCGGCTTCCT GGCCCTCACC 
7561 TTGGCGGCGT TGCTGGCGCT GCTCTTCCCG CGCACCTCCC AGCAGGCCAC CAAGGAGATC 
7621 GCCTGGCCCG TGGTGCTGCT GGTATGCGGG ATCGTGACCT ACGTCGCCCT GCTCCAGGAG 
7681 CTGGGCATCG TGGACTCCCT GGGGAAGATG ATCGCGGCGA TCGGCACCCC GCTGCTGGCC 
7741 GCCCTGGTGA TCTGCTACGT GGGCGGTGTC GTCTCGGCCT TCGCCTCGAC CACCGGGATC 
7801 CTCGGTGCCC TGATGCCGCT GTCCGAGCCG TTCCTGAAGT CCGGTGCCAT CGGGACGACC 
7861 GGCATGGTGA TGGCCCTGGC GGCCGCGGCG ACCGTGGTGG ACGCGAGTCC CTTCTCCACC 
7921 AATGGTGCTC TGGTGGTGGC CAACGCTCCC GAGCGGCTGC GGCCCGGCGT GTACCAGGGG 
7981 TTGCTGTGGT GGGGCGCCGG GGTGTGCGCA CTGGCTCCCG CGGCCGCCTG GGCGGCCTTC 
8041 GTGGTGGCGT GAGCGCAGCG GAGCGGGAAT CCCCTGGAGC CCGTTTCCCG TGCTGTGTCG 
8101 CTGACGTAGC GTCAAGTCCA CGTGCCGGGC GGGCAGTACG CCTAGCATGT CGGGCATGGC 
8161 TAATCAGATA ACCCTGTCCG ACACGCTGCT CGCTTACGTA CGGAAGGTGT CCCTGCGCGA 
8221 TGACGAGGTG CTGAGCCGGC TGCGCGCGCA GACGGCCGAG CTGCCGGGCG GTGGCGTACT 
8281 GCCGGTGCAG GCCGAGGAGG GACAGTTCCT CGAGTTCCTG GTGCGGTTGA CCGGCGCGCG 
8341 TCAGGTGCTG GAGATCGGGA CGTACACCGG CTACAGCACG CTCTGCCTGG CCCGCGGATT 
8401 GGCGCCCGGG GGCCGTGTGG TGACGTGCGA TGTCATGCCG AAGTGGCCCG AGGTGGGCGA 
8461 GCGGTACTGG GAGGAGGCCG GGGTTGCCGA CCGGATCGAC GTCCGGATCG GCGACGCCCG 
8521 GACCGTCCTC ACCGGGCTGC TCGACGAGGC GGGCGCGGGG CCGGAGTCGT TCGACATGGT 
8581 GTTCATCGAC GCCGACAAGG CCGGCTACCC CGCCTACTAC GAGGCGGCGC TGCCGCTGGT 

8641 ACGCCGCGGC GGGCTGATCG TCGTCGACAA CACGCTGTTC TTCGGCCGGG TGGCCGACGA 
8701 AGCGGTGCAG GACCCGGACA CGGTCGCGGT ACGCGAACTC AACGCGGCAC TGCGCGACGA 
8761 CGACCGGGTG GACCTGGCGA TGCTGACGAC GGCCGACGGC GTCACCCTGC TGCGGAAACG 
8821 GTGACCGGGG CGATGTCGGC GGCGGTCAGC GTCAGCGTCG TCGGCGCGGG CCTCGCGGAG 
8881 GGCTCCAGAT GCAGGCGTTC GACGCCGGCG GCGGAAGCGC CCGCCACCTC GGACACGCAG 
8941 GGGCAGTCGG AGTCCGCGAA GCCCGCGAAC CGGTAGGCGA TCTCCATCAT GCGGTTGCGG 
9001 TCCGTACGCC GGAAGTCCGC CACCAGGTGC GCCCCCGCGC GGGCGCCCTG GTCCGTGAGC 
9061 CAGTTCAGGA TCGTCGCACC GGCACCGAAC GACACGACCC GGCAGGACGT GGCGAGCAGT 
9121 TTGAGGTGCC ACGTCGACGG CTTCTTCTCC AGCAGGATGA TGCCGACGGC GCCGTGCGGG 
9181 CCGAAGGGGT CGCCCATGGT GACGACGAGG ACCTCATGGG CGGGATCGGT GAGCACGCGC 
9241 GCAGGTCGGC GTCGGAGTAG TGCACGCCGG TCGCGTTCAT CTGGCTGGTC CGCAGCGTCA 
9301 GTTCCTCGAC GCGGCTGAGT TCCTCCTCCC CCGCGGGTGC GATCGTCATG GAGAGGTCGA 
9361 GCGAGCGCAG GAAGTCCTCG TCGGGACCGG AGTACGCCTC CCGGGCCTGG TCGCGCGCGA 
9421 AACCCGCCTG GTACATCAGG CGGCGCCGAC GCGAGTCGAC CGTGGACACC GGCGGGCTGA 
9481 ACTCCGGCAG CGACAGGAGC GTGGCCGCCT GCTCGGCCGG GTAGCACCGC ACCTCGGGCA 
9541 GGTGGAACGC CACCTCGGCA CGCTCGGCGG GCTGGTCGTC GATGAACGCG ATCGTGGTCG 
9601 GTGCGAAGTT CAGCTCCGTG GCGATCTCGC GGACGGACTG CGACTTCGGC CCCCATCCGA 
9661 TGCGGGCCAG CACGAAGTAC TCCGCCACAC CGAGGCGTTC CAGACGCTCC CACGCGAGGT 
9721 CGTGGTCGTT CTTGCTCGCC ACCGCCTGGA GGATGCCGCG GTCGTCGAGC GTGGTGATCA 
9781 CCTCGCGGAT CTCGTCGGTG AGGACCACCT CGTCGTCCTC CAGCACGGTG CCCCGCCACA 
9841 AGGTGTTGTC CAGGTCCCAG ACCAGACACT TGACAATGGT CATGGCTGTC CTCTCAAGCC 
9901 GGGAGCGCCA GCGCGTGCTG GGCCAGCATC ACCCGGCACA TCTCGCTGCT GCCCTCGATG 
9961 ATCTCCATGA GCTTGGCGTC GCGGTACGCC CGTTCGACGA CGTGTCCCTC TCTCGCGCCT 

10021 GCCGACGCGA GCACCTGTGC GGCGGTCGCG GCCCCGGCGG CGGCTCGTTC GGCGGCGACG 
10081 TGCTTGGCCA GGATCGTCGC GGGCACCATC TCGGGCGAGC CCTCGTCCCA GTGGTCGCTG 
10141 GCGTACTCGC ACACGCGGGC CGCGATCTGC TCCGCGGTCC ACAGGTCGGC GATGTGCCCG 
10201 GCGACGAGTT GGTGGTCGCC GAGCGGCCGG CCGAACTGCT CCCGGGTCCG GGCGTGGGCC 
10261 ACCGCGGCGG TGCGGCAGGC CCGCAGGATC CCGACGCAGC CCCAGGCGAC CGACTTGCGC 
10321 CCGTAGGCGA GTGACGCCGC GACCAGCATC GGCAGTGACG CGCCGGAGCC GGCCAGGACC 
10381 GCGCCGGCCG GCACACGCAC CTGGTCCAGG TGCAGATCGG CGTGGCCGGC GGCGCGGCAG 
10441 CCGGACGGCT TCGGGACGCG CTCGACGCGT ACGCCGGGGG TGTCGGCGGG CACGACCACC 
10501 ACCGCACCGG AACCATCCTC CTGGAGACCG AAGACGACCA GGTGGTCCGC GTAGGCGGCG 
10561 GCAGTCGTCC AGACCTTGTG GCCGTCGACG ACAGCGGTGT CCCCGTCGAG CCGAACCCGC 
10621 GTCCGCATCG CCGACAGATC GCTGCCCGCC TGCCGCTCAC TGAAGCCGAC GGCCGCGAGT 
10681 TTCCCGCTGG TCAGCTCCTT CAGGAAGGTC GCCCGCTGAC CGGCGTCGCC GAGCCGCTGC 

10741 ACGGTCCACG CGGCCATGCC CTGCGACGTC ATGACACTGC GCAGCGAACT GCAGAGGCTG 
10801 CCGACGTGTG CGGTGAACTC GCCGTTCTCC CGGCTGCCGA GTCCCAGACC GCCGTGCTCG 
10861 GCCGCCACTT CCGCGCAGAG CAGGCCGTCG GCGCCGAGCC GGACGAGCAG GTCGCGCGGC 
10921 AGTTCGCCGG ACGTGTCCCA CTCGGCGGCC CGGTCACCGA CAAGGTCGGT CAGCAGCGCG 
10981 TCACGCTCAG GCATCGACGG CCCGCAGCCG GTGGACGAGT GCGACCATGG ACTCGACGGT 
11041 ACGGAAGTTC GCGAGCTGGA GGTCCGGGCC GGCGATCGTG ACGTCGAACG TCTTCTCCAG 
11101 GTACACGACC AGTTCCATCG CGAACAGCGA CGTGAGGCCG CCCTCCGCGA ACAGGTCGCG 
11161 GTCCACGGGC CAGTCCGACC TGGTCTTCGT CTTGAGGAAC GCGACCAACG CGTGCGCGAC 
11221 GGGGTCGTCC TTGACGGGTG CGGTCATGAG AACACCTTCT CGTATTCGTA GAAGCCCCGG 
11281 CCGGTCTTCC GGCCGTGGTG TCCCTCGCGG ACCTTGCCCA GCAGCAGGTC ACAGGGGCGG 
11341 CTGCGCTCGT CGCCGGTGCG TTTGTGCAGC ACCCACAGCG CGTCGACGAG GTTGTCGATG 
11401 CCGATCAGGT CCGCGGTGCG CAGCGGCCCG GTCGGATGGC CGAGGCACCC CGTCATGAGC 
11461 GCGTCGACGT CCTCGACGGA CGCGGTGCCC TCCTGCACGA TCCGCGCCGC GTCGTTGATC 
11521 ATCGGGTGGA GCAGCCGGCT CGTGACGAAG CCGGGCGCGT CCCGGACGAC GATCGGCTTG 
11581 CGCCGCAGCG CCGCGAGCAG GTCCCCGGCG GCGGCCATGG CCTTCTCACC GGTCCGGGGT 
11641 CCGCGGATCA CCTCGACCGT CGGGATCAGG TACGACGGGT TCATGAAGTG CGTGCCGAGC 
11701 AGGTCCTCGG GCCGGGCCAC GGAGTCGGCC AGTTCGTCAA CCGGGATCGA CGACGTGTTC 
11761 GTGATGACCG GGATACCGGG CGCCGCTGCC GAGACCGTGG CGAGTACCTC CGCCTTGACC 
11821 TCGGCGTGCT CGACGACGGC CTCGATCACC GCGGTGGCCG TACCGATCGC GGGCAGCGCG 
11881 GACGTGGCCG TCCGCAGCAC ACCGGGGTCG GCCTCGGCGG GCCCGGCCAC GAGTTGTGCC 
11941 GTCCGCAGTT CGGTGGCGAT CCGCGCCCGC GCCGCCGTAA GGATCTCCTC GGACGTGTCG 
12001 ACGAGTGTCA CCGGGACGCC GTGGCGCAGC GCGAGCGTGG TGATGCCGGT GCCCATCACT 
12061 CCCGCGCCGA GCACGATCAG CTGGTGGTCC ACGCTGTTTC CTCCCTCCGG GGTCACCATG 
12121 GCAGCGAGTA CGGGTCGAGG ACGTCTTCCG GGGTCGACCC GATCGCGTCC TTGCGGCCGA 
12181 GGCCGAGTTC GTCGGCGAAG CCGAGCAGCA CGTCGAACGC GATGTGGTCG GCGAACGCGC 
12241 TGCCCGTCGA GTCGAGGACG CTCAGGCTGT CCCGGTGGTC CGCCGCGGTG TCCGGTGCCG 
12301 CGCACAGGGC CGCCAGCGAC GGGCCGAGCT CGCGGTCCGG CAGTTGCTGG TACTCGCCCT 
12361 CGGCGCGGGC CTGCCCCGGA TGGTCGACGC AGATGAACGC GTCGTCGAGC AGGGTCTTCG 
12421 GCAGTTCGGT CTTGCCCGGC TCGTCGGCGC CGATGGCGTT CACATGCAGG TGCGGCAGCC 
12481 GCGGCTCGGC GGGCAGCACC GGCCCTTTGC CCGAGGGCAC CGAGGTGACG GTGGACAGGA 
12541 CATCCGCGGC GGCGGCGGCC TCCGCCGGAT CGGTCACCTT GACCGGCAGT CCGAGGAACG 
12601 CGATGCGGTC CGCGAACGAC GCCGCGTGGC CGGGGTCGGT GTCGCTGACC AGGATCCGCT 
12661 CGATGGGCAG GACCCTGCTG AGCGCGTGCG CCTGGGTCAC CGCCTGTGCG CCCGCGCCGA 
12721 TCAGCGTGAG CGTGGCGCTG TCGGACCGGG CCAGCAGCCG GCTCGCGACG GCGGCGACCG 
12781 CGCCGGTCCG CATCGCGGTG ATCACGCCTG CGTCGGCGAG GGCGGTCAGA CTGCCGCTGT 
12841 CGTCGTCGAG GCGCGACATC GTGCCGACGA TCGTCGGCAG CCGGAAGCGC GGATAGTTGT 
12901 GCGGACTGTA CGAAACCGTC TTCATGGTCA CGCCGACACC GGGGACCCGG TACGGCATGA 
12961 ACTCGATGAC GCCGGGAATG TCGCCGCCGC GGACGAATCC GGTACGCGGC GGCGCCTCGG 
13021 CGAACTCGCC GCGGCCGAGC GCGGCGAACC CGTCGTGCAG CTCGCTGATC AGCCGGTCCA 
13081 TCATCACGTC GCGGCCGATC ACGGAGAGAA TCCGCTTGAT GTCACGTTGG CGCAGGACCC 
13141 TGGTCTGCAT GTGTCACCTC CCTTTCGTGG CCGGAGCTGT CTTGGTGGTG CCGCTCGGGG 
13201 CGGCTTCCGT TCTCATCGCA GCTCCCTGTC GATGAGGTCG AAAATCTCGT CCGCGGTCGC 
13261 GTCCGCGGAC AGCACGCCGG CCGGCGTGGT CGGGCGGGTC TCCCGCCGCC AGCGGTTGAG 
13321 CAGGGCGTCC AGCCGGGTTC CGATCGCGTC CGCCTGGCGG GCGCCCGGGT CGACACCGGC 
13381 AACGAGTGCT TCCAGCCGGT CGAGCTGCGC GAGCACCACG GTCACCGGGT CGTCCGGGGA 
13441 CAGCAGTTCA CCGATGCGGT CGGCGAGTGC GCGCGGCGAC GGGTAGTCGA AGACGAGCGT 
13501 GGCGGACAGT CGCAGACCGG TCGCCTCGTT GAGGCCGTTG CGCAGCTGCA CCGCGATGAG 
13561 CGAGTCCACA CCGAGTTCCC GGAACGCCGC GTCCTCCGGG ATGTCCTCCG GGTCGGCGTG 
13621 GCCCAGGACG GCCGCTGCCT TCTGCCGGAC GAGGGCGAGC AGGTCGGTGG GGCGTTCCTG 
13681 CTCGTTGCGG GCGCTCCGGC GGGCCGACGG CTTGGGCCGG CCACGCAGCA GCGGGAGGTC 
13741 CGGCGGChGG TCGCCCGCCA CGGCGACGAC ACTGCCCGTT CCGGTGTGGA CGGCGGCGTC 
13801 GTACATGCGC ATGCCCTGTT CGGCGGTGAG CGCGCTCGCC CCACCCTTGC GCATACGGCG 
13861 CCGGTCGGCG TCGGTCAGGT CCGCGGTCAG GCCACTCGCC TGGTCCCACA GCCCCCACGC 
13921 GATCGACAGC CCTGGCAGCC CTTGTGCACG CCGGTGTTCG GCGAGCGCGT CGAGGAACGC 
13981 GTTCGCCGCC GCGTAGTTGC CCTGACCGGG GGTGCCCAGC ACACCGGCCG CCGACGAGTA 
14041 GACGACGAAT GCGGCGAGGT CGGTGTCGCG GGTGAGCCGG TGCAGGTGCC AGGCGGCGTC 
14101 GGCCTTGGGT TTGAGGACGG TGTCGATGCG GTCGGGGGTG AGGTTGTCGA GCAGGGCGTC 
14161 GTCGAGGGTT CCGGCGGTGT GGAAGACGGC GGTGAGGGGT TGAGGGATGT GGGCGAGGGT 
14221 GGTGGCGAGT TGGTGGGGGT CGCCGACGTC GCAGGGGAGG TGGGTGCCGG GGGTGGTGTC 
14281 GGGGGGTGGG GTGCGGGAGA GGAGGTAGGT GTGGGGGTGG TTCAGGTGGC GGGCGAGGAT 
14341 GCCGGCGAGG GTGCCGGAGC CGCCGGTGAT GACGACGGCC CCCTCGGGGT CCAGCGGCCG 
14401 CGGGACCGTG AGGACGATCT TGCCGGTGTG CTCGCCGCGG CTCATGGTCG CCAGCGCCTC 
14461 GCGGACCTGC CGCATGTCGT GCACCGTCAC CGGCAGCGGG TGCAGCACAC CGCGCGCGAA 
14521 CAGGCCGAGC AGCTCCGCGA TGATCTCCTT GAGCCGGTCG GGCCCCGCGT CCATCAGGTC 
14581 GAACGGTCGC TGGACGGCGT GCCGGATGTC CGTCTTCCCC ATCTCGATGA ACCGGCCACC 
14641 CGGCGCGAGC AGGCCGACGG ACGCGTCGAG GAGTTCACCG GTGAGCGAGT TGAGCACGAC 
14701 GTCGACCGGC GGGAACGCGT CGGCGAACGC GGTGCTGCGG GAATCGGCCA GATGCGCTCC 
14761 GTCCAGGTCC ACCAGATGGC GCTTCGCGGC GCTGGTGGTC GCGTACACCT CCGCGCCCAG 
14821 GTGCCGCGCG ATCTGCCGGG CGGCGGAACC GACACCGCCG GTGGCCGCGT GGATCAGGAC 
14881 CTTCTCGCCG GGGCGCAGCC CGGCGAGGTC GACCAGGCCG TACCACGCGG TCGCGAACGC 
14941 GGTCATCACG GACGCCGCCT GCGGGAACGT CCAGCCGTCC GGCATCCGGC CGAGCATCCG 
15001 GTGGTCGGCG ATGACCGTGG GGCCGAAGCC GGTGCCGACG AGGCCGAAGA CGCGGTCGCC 
15061 CGGTGCCAGA CCGGAGACGT CGGCGCCGGT CTCCAGGACG ATGCCCGCGG CCTCGCCGCC 
15121 GAGCACGCCC TGACCGGGGT AGGTGCCGAG CGCGATCAGC ACATCGCGGA AGTTGAGGCC 
15181 CGCCGCACGC ACACCGATCC GGACCTCGGC CGGGGCGAGG GGGCGCCGGG GCTCCGCCGA 
15241 GTCGGCCGCG GTGAGGCCGT CGAGGGTGCC CGTCCGCGCC GGCCGGATCA GCCACGTGTC 
15301 GCTGTCCGGC ACGGTGAGCG GCTCCGGCAC CCGGGTGAGG CGGGCCGCCT CGAACCGGCC 
15361 GCCGCGCAGC CGCAGACGCG GCTCGCCGAG TGCGACGGCG ATGCGCTGCT GCTCGGGGGC 
15421 GAGCGTGACG CCGGACTCGG TCTCGACGTG GACGAACCGG CCGGGCTGCT CGGCCTGGGC 
15481 GGCGCGCAGC AGTCCGGCCG CGGCGCCGGT GGCGAGGCCC GCGGTGGTGT GCACGAGCAG 
15541 ATCCCCGCCG GAGCCGGTCA GGGCGGTCAG CAGCCGGGTG GTGAGCGCAC GCGTCTCGGC 
15601 CACCGGGTCG TCGCCATCAG CGGCAGGCAA CGTGATGACG TCCACGTCGG TCGCGGGGAC 
15661 ATCCGTGGGT GCGGCGACCT CGATCCAGGT GAGACGCATC AGGCCGGTGC CGACGGGTGG 
15721 GGACAGCGGG CGGGTGCGGA CCGTCCGGAT CTCGGCGACG AGTTGGCCGG CGGAGTCGGC 
15781 GACGCGCAGA CTCAGCTCGT CGCCGTCACG AGTGATCACG GCTCGGAGCA TGGCCGAGCC 
15841 CGTGGCGACG AACCGGGCCC CCTTCCAGGC GAACGGCAGA CCCGCAGCGC TGTCGTCCGG 
15901 CGTGGTGAGG GCGACGGCGT GCAGGGCCGC GTCGAGCAGC GCCGGATGCA CACCGAAACC 
15961 GTCCGCCTCG GCGGCCTGCT CGTCGGGCAG CGCCACCTCG GCATACACGG TGTCACCATC 
16021 ACGCCAGGCA GCCCGCAACC CCTGGAACGC CGACCCGTAC TCATAACCGG CATCCCGCAG 
16081 TTCGTCATAG AACCCCGAGA CGTCGACGGC CACGGCCGTG ACCGGCGGCC ACTGCGAGAA 
16141 CGGCTCCACA CCGACAACAC CGGGGGTGTC GGGGGTGTCG GGGGTCAGGG TGCCGCTGGC 

16201 GTGCCGGGTC CAGCTGCCCG TGCCCTCGGT ACGCGCGTGG ACGGTCACCG GCCGCCGTCC 
16261 GGCCTCATCA GCCCCTTCCA CGGTCACCGA CACATCCACC GCTGCGGTCA CCGGCACCAC 
16321 AAGGGGGGAT TCGATGACCA GCTCGTCCAC TATCCCGCAA CCGGTCTCGT CACCGGCCCG 
16381 GATGACCAGC TCCACAAACG CCGTACCCGG CAGCAGGACC GTGCCCCGCA CCGCGTGATC 
16441 AGCCAGCCAG GGGTGAGTGC GCAATGAGAT CCGGCCAGTG AGAACAACAC CACCATCGTC 
16501 GGCGGGCAGC GCTGTGACAG CGGCCAGCAT CGGATGCGCC GCACCCGTCA ACCCCGCCGC 
16561 CGACAGATCG GTGGCACCGG CCGCCTCCAG CCAGTACCGC CTGTGCTCGA ACGCGTACGT 
16621 GGGCAGATCC AGCAGCCGTC CCGGCACCGG TTCGACCACC GTGTCCCAGT CCACTGCCGT 
16681 GCCCAGGGTC CACGCCTGCG CCAACGCCGT CAGCCACCGC TCCCAGCCGC CGTCACCGGT 
16741 CCGCAACGAC GCCACCGTGT GAGCCTGCTC CATCGCCGGC AGCAGCACCG GATGGGCACT 
16801 GCACTCCACG AACACCGACC CATCCAGCTC CGCCACCGCC GCGTCCAACG CCACCGGACG 
16861 ACGCAGATTC CGGTACCAGT ACCCCTCATC CACCGGCTCC GTCACCCAGG CGCTGTCCAC 
16921 GGTCGACCAC CACGCCACCG ACGCGGCCTT CCCTGCCACC CCCTCCAGTA CCTTGGCCAG 
16981 TTCATCCTCG ATGGCTTCCA CGTGGGGCGT GTGGGAGGCG TAGTCGACCG CGATACGACG 
17041 CACCCGCACG CCTTCGGCCT CATACCGCGC CACCACCTCC TCCACCGCCG ACGGGTCCCC 
17101 CGCCACCACC GTCGAAGCCG GGCCGTTACG CGCCGCGATC CACACACCCT CGACCAGACC 
17161 GACCTCACCG GCCGGCAACG CCACCGAAGC CATCGCTCCC CGCCCGGCCA GTCGCGCCGC 
17221 GATGACCTGA CTGCGCAATG CCACCACGCG GGCGGCGTCC TCGAGGCTGA GGGCTCCGGC 
17281 CACGCACGCC GCCGCGATCT CGCCCTGGGA GTGTCCGATC ACCGCGTCCG GCACGACCCC 
17341 ATGCGCCTGC CACAGCGCGG CCAGGCTCAC CGCGACCGCC CAGCTGGCCG GCTGGACCAC 
17401 CTCCACCCGC TCCGCCACAT CCGGCCGCGC CAACATCTCC CGCACATCCC AGCCCGTGTG 
17461 CGGCAGCAAC GCCTGAGCGC ACTCCTCCAT ACGCGCGGCG AACACCGCGG AGTGGGCCAT 
17521 GAGTTCCACG CCCATGCCGA CCCACTGGGC GCCCTGGCCG GGGAAGACGA ACACCGTACG 
17581 CGGCTGGTCC ACCGCCACAC CCGTCACCCG GGCATCGCCC AGCAGCACCG CACGGTGACC 
17641 GAAGACAGCA CGCTCCCGCA CCAACCCCTG CGCGACCGCG GCCACATCCA CACCACCCCC 
17701 GCGCAGATAC CCCTCCAGCC GCTCCACCTG CCCCCGCAGA CTCACCTCAC CACGAGCCGA 
17761 CACCGGCAAC GGCACCAACC CGTCAACAAC CGACTCCCCA CGCGACGGCC CAGGAACACC 
17821 CTCAAGGATC ACGTGCGCGT TCGTACCGCT CACCCCGAAC GACGACACAC CCGCATGCGG 
17881 TGCCCGATCC GACTCGGGCC ACGGCCTCGC CTCGGTGAGC AGCTCCACCG CACCGGCCGA 
17941 CCAGTCCACA TGCGACGACG GCTCGTCCAC ATGCAGCGTC TTCGGCGCGA TCCCGTACCG 
18001 CATCGCCATG ACCATCTTGA TCACACCGGC GACACCCGCC GCCGCCTGCG CATGACCGAT 
18061 GTTCGACTTC AACGAACCCA GCAGCAGCGG AACCTCACGC TCCTGCCCGT ACGTCGCCAG 
18121 AATGGCCTGC GCCTCGATGG GATCGCCCAG CGTCGTCCCC GTCCCGTGCG CCTCCACCAC 
18181 GTCCACATCG GCGGCGCGCA GTCCGGCGTT CACCAACGCC TGCTGGATGA CACGCTGCTG 
18241 GGACGGGCCG TTGGGGGCGG ACAGCCCGTT GGAGGCACCG TCCTGGTTCA CCGCCGACCC 
18301 GCGGACGACC GCGAGAACGG TGTGTCCGTT GCGCTCGGCG TCGGAGAGCC GCTCCAGCAC 
18361 AAGAACGCCG GCGCCCTCCG CCCAGCCGGT GCCGTTGGCG GCGTCCGCGA ACGCGCGGCA 
18421 GCGGCCGTCG GGGGAGAGTC CGCCCTGCTG CTGGAATTCC ACGAACCCGG TCGGGGTCGC 
18481 CATGACGGTG ACACCGCCGA CCAGCGCCAG CGAGCACTCC CCGTGGCGCA GTGCGTGCCC 
18541 GGCCTGGTGC AGCGCGACCA GCGACGACGA GCACGCCGTG TCCACCGTGA ACGCCGGTCC 
18601 CTGGAGCCCA TAGAAGTACG AGATCCGGCC GGTGAGCACG CTGGGCTGCA TGCCGATCGA 
18661 GCCGAACCCG TCCAGGTCCG CGCCGACGCC GTACCCGTAC GAGAAGGCGC CCATGAACAC 
18721 GCCGGTGTCG CTGCCGCGCA GTGTGCCCGG CACGATGCCC GCGCTCTCGA ACGCCTCCCA 
18781 TGTCGTTTCC AGCAGGATCC GCTGCTGGGG GTCCATGGCC CGTGCCTCAC GGGGGCTGAT 
18841 GCCGAAGAAG GCGGCATCGA AGCCGGCGGC GTCGGAGAGG AAGCCGCCGC GGTCCGTGTC 
18901 CGATCCGCCG GTGAGGCCGG ACGGGTCCCA GCCACGGTCG GCCGGGAAGC CGGTGACCGC 
18961 GTCGCCGCCA CTGTCCACCA TGCGCCACAG GTCGTCGGGC GAGGTGACGC CGCCCGGCAG 
19021 TCGGCAGGCC ATGCCCACGA TGGCCAGCGG TTCGTCACGG GTCGCGGCGG CTGTGGGAAC 
19081 AGCGACCGGT GCGGCACCAC CGACCAGAGC CTCGTCCAAC CGCGACGCGA TGGCCCGCGG 
19141 CGTCGGGTAG TCGAAGACAA GCGTGGCGGG CAGTCGGACA CCGGTCGCCG CGGCGAGTCG 
19201 GTTCCGCAGT TCGACGGCGG TCAGCGAGTC GATACCCAGT TCCTTGAAGG CCGCGTCCGC 
19261 GGACACGTCC GCGGCGTCCG CGTGGCCGAG CACCGCCGCC GCGTTGTCGC GGACCAGTGC 
19321 CAGCAGCGCG GTGTCCCGCT CAGCGCCGGA CATGGTGCCG AGCCGGTCGG CGAGCGGAAC 
19381 GGCGGTGGCC GCCGCCGGGC GCGATACGGC GCGGCGCAGA TCGGCGAAAA GCGGCGATGT 
19441 GTGCGCGGTG AGGTCCATCG TGGCCGCCAC GGCGAACGCG GTGCCGGTTC CGGCCGCGGC 
19501 TTCCAGCAGG CGCATGCCCA CACCGGCCGA CATGGGGCGG AAACCGCCGC GGCGGACACG 
19561 GGTGCGGTTG GTGCCGCTCA TGCTGCCGGT GAGTCCGCTG TCATCGGCCC AGAGGCCCCA 
19621 GGCCAGCGAC AGCGCGGGCA GTCCTTCGGC ATGGCGCAGC GTCGCGAGTC CGTCGAGGAA 
19681 CCCGTTCGCC GCCGAGTAGT TGCCCTGGCC GCGGCCGCCC ATGATGCCCG CGACGGACGA 
19741 GTAGAGGACG AACGAGCGCA GGTCCGCGTC CCGGGTCAGC TCGTGCAGGT GCCAGGCGCC 
19801 GTCGGCTTTG GGGCGCAGTG TGGTGGCGAG CCGCTCCGGG GTGAGTGCCG TGGTCACGCC 
19861 GTCGTCGAGC ACGGCTGCCG TGTGGAAGAC CGCCGTGAGC GGCCTGCCGG CGGCGGCGAG 
19921 CGCGGCGGCG AGCTGGTCCC GGTCGGCGAC GTCACAGCGG ATGTGGACAC CGGGAGTGTC 
19981 CGCCGGCGGT TCGCTGCGCG ACAGCAACAG GAGGTGGCGG GCGCCATGCT CGGCGACGAG 
20041 ATGCCGGGCG AGGAGACCTG CCAGCACACC CGAGCCGCCG GTGATGACCA CCGTGCCGTC 
20101 CGGGTCGAGC AGCGGTTCGG GCGTTTCCGC GGCGGCCGTG CGGGTGAACC GCGGCGCTTC 
20161 GTACCGGCCG TCGGTGACGC GGACGTACGG CTCGGCCAGT GTCGTGGCGG CGGCCAGCGC 
20221 CTCGATGGGG GTGTCGGTGC CGGTCTCCAC CAGCACGAAC CGGCCCGGGT GCTCGGCCTG 
20281 GGCGGACCGG ACGAGGCCGG CGACCGCTCC TCCGACCGGT CCCGCGTCGA TCCGGACGAC 
20341 GAGGGTGGTC TCCGCAGGGC CGTCCTCGGC GATCACCCGG TGCAGCTCGC CGAGCACGAA 
20401 CTCGGTGAGC CGGTACGTCT CGTCGAGGAC ATCCGCGCCC GGTTCCGGGA GCGCGGAGAC 
20461 GATGTGGACC GCGTCCGCAG GACCGGGCCC GGGAGTGGGC AGCTCGGTCC AGGAGAGGCC 
20521 GTACAAGGAG TTCCGTACGA CGGCGGCGTC GCCGTCGACG TTCACCGGTC GCGCGGTCAG 
20581 CGCGGCGACG GTCACCACCG GTTGGCCGAC CGGGTCCGTC GCATGCACGG CAGCGCCGTC 
20641 CGGGCCCTGA GTGATCGTGA CGCGCAGCGT GGTGGCCCCG GTCGTGTGGA ACCGCACGCC 
20701 GCTCCACGAG AACGGCAGCC GCACCTCCGC TTCCTGTTCC GCGAGCAGCG GCAGGCAGGT 
20761 GACGTGCAAG GCCGCGTCGA ACAGCGCCGG GTGGACGCCA TAGTGCGGCG TGTCGTCCGC 
20821 CTGTTCCCCG GCGATCTCCA CCTCGGCGTA CAGGGTTTCG CCGTCGCGCC AGGCGGTGCG 
20881 CAGTCCCTGG AACGCTGGGC CGTAGCTGTA GCCGGTCTCG GCCAGCCGCT CGTAGAACGC 
20941 GCTCACGTCG ACGCGTCGCG CGCCCGGCGG CGGCCACGCG GGCGGCGGGA CCGCCGCGAC 
21001 GCTTCCGGCC CGGCCGAGGG TGCCGCTGGC GTGCCGGGTC CAGCTGTCCG TGCCCTCGGT 
21061 ACGCGCGTGG ACGGTCACTC GCCGCCGTCC GGCCTCATCG GCCCCTTCGA CGGTCACCGA 
21121 CACATCCACC GCGCCGGTCA CCGGCACCAC GAGCGGGGTC TCGATGACCA GTTCATCCAC 
21181 CACCCCGCAA CCGGTCTCGT CACCGGCCCG GATGACCAGC TCCACAAACG CCGTACCCGG 
21241 CAGCAGAACC GTGCCCCGCA CCGCGTGATC AGCCAGCCAG GGATGCGTAC GCAACGAGAT 
21301 CCGGCCAGTG AGAACAACAC CACCACCGTC GTCGGCGGGC AGTGCTGTGA CGGCGGCCAG 
21361 CATCGGATGC GCCGCCCCGG TCAGCCCGGC CGCGGACAGA TCGGTGGCAC CGGCCGCCTC 
21421 CAGCCAGTAC CGCCTGTGCT CGAACGCGTA GGTGGGCAGA TCGAGCAGCC GTCCCGGCAC 
21481 CGGTTCGACC ACCGTGTCCC AGTCCACTGC CGTGCCCAGG GTCCACGCCT GCGCCAACGC 
21541 CGTCAGCCAC CGCTCCCAGC CGCCGTCACC GGTCCGCAAC GACGCCACCG TGTGAGCCTG 
21601 TTCCATCGCC GGCAGCAGCA CCGGATGGGC GCTGCACTCC ACGAACACGG ACCCGTCCAG 
21661 CTCCGCCACC GCCGCGTCCA GCGCGACGGG GCGACGCAGG TTCCGGTACC AGTAGCCCTC 
21721 ATCCACCGGC TCGGTCACCC AGGCGCTGTC CACCGTGGAC CACCAGGCCA CCGACCCGGT 
21781 CCCGCCGGAA ATCCCCTCCA GTACCTCGGC CAACTCGTCC TCGATGGCTT CCACGTGGGG 
21841 CGTGTGGGAG GCGTAGTCGA CCGCGATACG GCGCACTCGC ACGCCTTCGG CCTCGTACCG 
21901 CGTCACCACT TCTTCCACCG CGGACGGGTC CCCCGCCACC ACAGTCGAAG ACGGGCCGTT 
21961 ACGCGCCGCG ATCCACACGC CCTCGACCAG GTCCACCTCA CCGGCCGGCA ACGCCACCGA 
22021 AGCCATCGCC CCCCGCCCGG CCAGCCGCCC GGCGATCACC TGGCTGCGCA AGGCCACCAC 
22081 GCGGGCGGCG TCCTCAAGGC TGAGGGCTCC GGCCACACAC GCCGCCGCGA TCTCGCCCTG 
22141 GGAGTGTCCG ACCACCGCGT CCGGCACGAC CCCATGCGCC TGCGACAGCG CGGCCAGGCT 
22201 CACCGCGACC GCCCAGCTGG CCGGCTGGAC CACCTCCACC CGCTCCGCCA CATCCGGCCG 
22261 CGCCAACATC TCCCGCACAT CCCAGCCCGT GTGCGGCAAC AACGCCCGCG CACACTCCTC 
22321 CATACGAGCC GCGAACACCG CAGAACACGC CATCAACTCC ACACCCATGC CCACCCACTG 
22381 AGCACCCTGC CCGGGAAAGA CGAACACCGT ACGCGGCTGA TCCACCGCCA CACCCATCAC 
22441 CCGGGCATCG CCCAACAACA CCGCACGGTG ACCGAAGACA GCACGCTCAC GCACCAACCC 
22501 CTGCGCGACC GCGGCCACAT CCACACCACC CCCGCGCAGA TACCCCTCCA GCCGCTCCAC 
22561 CTGCCCCCGC AGACTCACCT CACTCCGAGC CGACACCGGC AACGGCACCA ACCCATCGAC 
22621 AGCCGACTCC CCACGCGACG GCCCGGGAAC ACCCTCAAGG ATCACGTGCG CGTTCGTACC 
22681 GCTCACCCCG AAAGCGGAGA CACCGGCCCG GCGCGGACGT CCCGCGTCGG GCCACGCCCG 
22741 CGCCTCGGTG AGCAGTTCCA CCGCGCCCTC GGTCCAGTCC ACATGCGACG ACGGCTCGTC 
22801 CACATGCAGC GTCTTCGGCG CGATGCCATA CCGCATCGCC ATGACCATCT TGATGACACC 
22861 GGCGACACCC GCAGCCGCCT GCGCATGACC GATGTTCGAC TTCAACGAAC CCAGCAGCAG 
22921 CGGAACCTCA CGCTCCTGCC CGTACGTCGC CAGAATCGCG TGCGCCTCGA TGGGATCGCC 
22981 CAGCGTCGTC CCCGTCCCGT GCGCCTCCAC CACGTCCACG TCGGCGGGGG CGAGCCCCGC 
23041 CTTGTGGAGG GCCTGGCGGA TGACGCGCTG CTGGGAGGGG CCGTTGGGTG CGGAGATGCC 
23101 GTTGGAGGCG CCGTCCTGGT TGACGGCGGA GGAGCGGACG ACCGCGAGGA CGGTGTGTCC 
23161 GTTGCGCTCG GCGTCGGAGA GCTTTTCGAC GACGAGGACG CCGGCCCCCT CGGCGAAACC 
23221 GGTGCCGTCC GCCGCGTCAG CGAACGCCTT GCACCGTCCG TCCGGCGCGA CGCCGCCCTG 
23281 CCGGGAGAAC TCCACGAAGG TCTGTGGTGA TGCCATCACT GTGACACCAC CGACCAGCGC 
23341 CAGCGAGCAC TCCCCGGTCC GCAGCGCCTG CCCGGCCTGG TGCAGCGCGA CCAGCGACGA 
23401 CGAACACGCC GTGTCGACCG TGACCGCCGG ACCCTCCATG CCGAAGAAGT ACGACAGCCG 
23461 TCCGGCGAGC ACCGCGGGCT GTGTGCTGTA GGCGCCGAAT CCGCCCAGGT CCGCGCCCGT 
23521 GCCGTAGCCG TAGTAGAAGC CGCCGACGAA GACGCCGGTG TCGCTGCCGC GCAGGGTGTC 
23581 CGGCACGATG CCGGCGTGTT CGAGCGCCTC CCAGGCGATT TCGAGGAGGA TCCGCTGCTG 
23641 CGGGTCGAGT GCGGTGGCCT CGCGCGGACT GATGCCGAAG AACGCGGCAT CGAAGTCGGC 
23701 GGCGCCCGCG AGTGCGCCGG CCCGCCCGGT GGCGGACTCG GCGGCGGCGT GCAGCGCGGC 
23761 CACGTCCCAG CCGCGGTCGG TGGGGAAGTC GCCGATCGCG TCGCGGCCGT CCGCGACGAG 
23821 CTGCCACAGC TCTTCCGGTG AGGTGACGCC GCCCGGCAGT CGGCAGGCCA TGCCGACGAC 
23881 GGCGAGCGGC TCGTTCGCCG CGGCGCGCAG CGCGGTGTTC TCCCGGCGGA GCTGCGCGTT 
23941 GTCCTTGACC GACGTCCGCA GCGCCTCGAT CAGGTCGTTC TCGGCCATCG CCTCATCCCT 
24001 TCAGCACGTG CGCGATGAGC GCGTCTGCGT CCATGTCGTC GAACAGTTCG TCGTCCGGCT 
24061 CCGCGGTCGT GGTGCTCGCG GGTGCCTGTG CCGGTGGTTC ACCGCCGTCC GGGGTCCCGT 
24121 TGTCGTCCGG GGTCCCGTTG ACGTCCGGGG CCAGGAGGGT CAGCAGATGA CGGGTGAGCG 
24181 CGCCGGCGGC GGGATAGTCG AAGACGAGCG TGGCCGGCAG CGGAATGCCG AGGGCCTCGG 
24241 AGAGCCGGTT GCGCAGGCCG AGCGCGGTGA GCGAGTCGAC CCCGAGGTCC TTGAACGCCG 
24301 TGGTGGCCGT GACCGCCGCC GCGTCGGTGT GGCCCAGCAG GGTGGCGGCG GTGTCGCGGA 
24361 CGACGCCGAG CAGCACCTGT TCCCGTTCCT TGTGGGGCAG GTCCGGCAGG CGTTCCAGCA 
24421 GGGAGCCGCC GTCGGTCGCG GAGCGCCGGG TGGGGCGCTG GATCGGTCGC CACAGCGGTG 
24481 ACGGGTCGCC GGGCCCGGGT GGGGCGGTCG CCACGACCAC GGCTTCCCCG GTGGCGCACG 
24541 CGGCGTCGAG GAGGTCGGTC AGCCGGTCCG CCGCGGCGGT GAACGCCACG GCCGGCAGGC 
24601 CTTGTGCCCG GCGCAGGTCG GCCAGGGCCT GGAGCGGTCC GGCCGCCTCG CCGGACGGAA 
24661 CGGCGAGAAC GAACGCGGTC AGGTCGAGGT CGCGGGTCAG GCGGTGCAGT TCCCAGGCCG 
24721 ACTCGGCGGT GCCGTCCGCG TGGACGACCG CGGTCACCGG GGTTTCCGGC ACTGTGCCCG 
24781 GCTCGTACCG GATCACTTCG GCGCCGTGTC CGCCGAGGTG TCCGGCGAGT TCCTCCGAAC 
24841 CGCCCGCGAG GAGGACGGTG TCGCCGTACG AGGCCGCGGC CGTGGTGGGC GCGGCGGGGA 
24901 CGAGGCGGGG CGCTTCGAGG CGCCCGTCGG CCAGGCGCAG GTGCGGTTCG TCGAGGCGGG 
24961 AGAGGGCGGC GGCGCGGCGG GGGGTGACCG TGTCGGTGGT CTCCACGAGC ACGAGCCGGC 
25021 CCGGTTCCGC GGTGTCGAGC AGTGCGGCGA CGGCACCGGC GACGGGCCCG GCCTCGGCGG 
25081 ACACCACCAG CGTGGCGCCG GCGGTCCTCG GGTCGTCCAG TGCGGTACGG ACCTCGTCGG 
25141 GACCGGATAC CGGGACGACG ATGACGTCGG GCGTGGCGTC GTCGCCGAGG TCGGTGTACC 
25201 GGCGGGCCGT GGTGCCGGGT GCCGCCGGGG CCCGGACGCC GGTCCAGGTG CGCCGGAACA 
25261 GCCGCACGTC CCCGTCCGGG CCCGTCGTGG CGGGGGGCCG GGTGATGAGC GAGCCGATCT 
25321 GAGCCACCGG CCGTCCCAGT TCGTCGGCGA GGTGCAGGCG GGCGCCGCCC TCGCCCTCGC 
25381 CGTGGACGAA GGTGACGCGC AGTTTCGTGG CGCCGCTGGT GTGGACACGG ACGCCGGTGA 
25441 ACGCGAACGG CAACCGTACC CCCGCGTTCT CGGCGGCCGC GCCGATGCTG CCCGCTTGCA 
25501 GCGCGGTGAC GAGCAGCGCC GGGTGCAGTG TGTAGCGGGC GGCGTCCCTG GCGAGGGCGC 
25561 CGTCGAGGGC GACTTCGGCG CAGACGGTGT CTCCGTGGCT CCACGCGGCG GACATGCCGC 
25621 GGAACTCGGG GCCGAACTCG TATCCCGCGT CGTCGAGTCG CTGGTAGAAG GCCGCGACGT 
25681 CGACCGGTTC CGCGTGCTCG GGCGGCCAGG GCCCCGGCGT GGTGGCCGGT TCGGTGGTGG 
25741 CGATGCCGGC GAAGCCGGAG GCGTGGCGGG TCCATGTCCG GTCGCCGTCC GTCCGGGCGT 
25801 GGACGCGCAC GGCACGGCGT CCGGTGTCGT CGGGCGCGGC GACGGTCACG CGCACCTGGA 
25861 CGGCGCCGGT GGCGGGCAGG ACCAGCGGTG TCTCGACGAC CAGTTCGTCG AGCAGGTCGC 
25921 AGCCTGCCTC GTCGGCGCCG CGTCCGGCCA ATTCCAGGAA GGCGGGTCCG GGCAGCAGTA 
25981 CGGCGCCGTC GACGGAGTGA CCGGCCAGCC ATGGGTGGGT GGCCAGCGAG AACCGGCCGG 
26041 TGAGCAGCAC CTCGTCGGAG TCGGGGAGCG CCACCGACGC GGCGAGCAGC GGGTGGTCGA 
26101 CGGCGTCGAG TCCGAGGCCG GAAGCGTCCG TGCCGGCCGC GGTCTCGATC CAGTAGCGCT 
26161 CATGGTGGAA GGCGTATGTG GGCAGGTCGT GTGCCGTCGC CGTCGCGGGG ACGACCGCCG 
26221 CCCAGTCGAC GGGCACGCCG GTTGTGTGCG CCTCGGCCAG CGCGGTGAGC AGCCGGTGGA 
26281 CTCCCCCGCC GCGGCGGAGC GTGGCGACGG TCGCGCCGTC GATCGCGGGC AGCAGCACGG 
26341 GGTGCGCGCT GACCTCGACG AACACGGTGT CACCCGGCTC GCGGGCAGCG GTCACGGCCG 
26401 TGGCGAAGCC TACGGGGTGG CGCATGTTGC GGAACCAGTA CTCGTCGTCG AGCGGCGCGT 
26461 CGATCCAGCG TTCGTCGGCG GTGGAGAACC ACGGGATCTC GGGCGTGCGC GAGGTGGTGT 
26521 CCGCGACGAT CCGCTGGAGT TCGTCGTACA GCGGGTCGAC GAACGGGGTG TGGGTCGGGC 
26581 AGTCGACGGC GATGCGGCGC ACCCAGACGC CGCGGGCCTC GTAGTCGGCG ATCAGCGTTT 
26641 CGACGGCGTC CGGGCGCCCG GCGACGGTCG TGGTGGTGGC GCCGTTGCGG CCCGCGACCC 
26701 AGACGCCGTC GATCCGGGCG GCATCCGCCT CGACGTCGGC GGCCGGGAGC GCGACCGAGC 
26761 CCATCGCGCC GCGTCCGGCG AGTTCGCGCA GGAGCAGGAG AACGCTGCGC AGCGCGACGA 
26821 GGCGGGCACC GTCCTCCAGG GTGAGCGCTC CGGCGACACA GGCCGCGGCG ATCTCGCCCT 
26881 GGGAGTGTCC GATGACGGCG TCCGGGCGTA CGCCCGCGGC CTCCCACACG GCGGCCAGCG 
26941 ACACCATGAC GGCCCAGCAG ACGGGGTGCA CGACGTCGAC GCGGCGGGTC ACCTCCGGGT 
27001 CGTCGAGCAT GGCGATGGGG TCCCAGCCCG TGTGCGGGAT CAGCGCGTCG GCGCATTGGC 
27061 GCATCCTGGC GGCGAACACC GGGGAGGCCG CCATCAGTTC GACGCCCATG CCGCGCCACT 
27121 GCGGTCCTTG TCCGGGGAAG ACGAAGACGG TGCGCGGCTC GGTGAGCGCC GTGCCGGTGA 
27181 CGACGTCGTC GTCGAGCAGC ACGGCGCGGT GCGGGAACGT CGTACGCCTG GCGAGCAGGC 
27241 CCGCGGCGAT GGCGCGCGGG TCGTGGCCGG GACGGGCGGC GAGGTGCTCG CGGAGTCGGC 
27301 GGACCTGGCC GTCGAGGGCC GTGGCGGTCC GCGCCGAGAC GGGCAGTGGT GTGAGCGGCG 
27361 TGGCGATCAG CGGCTCACCG GGCTTCGAGG CCGACGGCTC CTCGGCCGGC GGCTCCCCGG 
27421 CCGGGTGGGC TTCCAGCAGG ACGTGGGCGT TGGTGCCGCT GACGCCGAAG GAGGACACAC 
27481 CGGCGCGCCG CGGGCGGTCG GTCTCGGGCC AGGGCCGGGC ATCGGTGAGG AGTTCGACGG 
27541 CGCCGGCCGT CCAGTCGACG TGCGAGGACG GCGTGTCCAC GTGCAGGGTG CGCGGCAGGG 
27601 TGCCGTGCCG CATGGCGAGG ACCATCTTGA TGACACCGGC GACACCCGCG GCGGCCTGAG 
27661 TGTGGCCGAT GTTGGACTTC AGCGAGCCCA GCAGCACCGG GGTGTCGCGC CCCTGCCCGT 
27721 AGGTGGCCAG CACCGCCTGT GCCTCGATGG GATCGCCCAG CCTGGTGCCG GTGCCGTGCG 
27781 CCTCCACGGC GTCCACGTCC GCCGGGGTGA GCCCGGCGTT GGCCAGGGCC TGCCGGATCA 
27841 CCCGCTCCTG CGAGGGCCCG TTCGGCGCCG ACAACCCGTT GGAAGCACCG TCCTGGTTGA 
27901 CCGCCGAACC CCGGACAACC GCCAGCACAC GGTGGCCGTT GCGCTCGGCA TCGGAGAGCC 
27961 TCTCGACGAT CAGCACACCG GACCCCTCGG CGAAACCGGT GCCGTCAGCC GCATCCGCGA 
28021 ACGCCTTGCA GCGCGCGTCG GGCGCGAGAC CCCGCTGCTG GGAGAACTCG ACGAAGCCGG 
28081 ACGGCGAGGC CATCACCGTG ACGCCGCCGA CCAGGGCGAG CGAGCATTCG CCGGAGCGCA 
28141 GTGACTGCCC GGCCTGGTGC AGCGCCACCA GCGACGACGA ACACGCCGTG TCGACCGTGA 
28201 CCGCCGGACC CTCCAGACCG TAGAAGTACG ACAGCCGACC GGACAGCACA CTGGTCTGGG 
28261 TGCCGGTCGC GCCGAAACCG CCCAGGTCGG TGCCGAGTCC GTACCCGTCG GAGAAGGCGC 
28321 CCATGAACAC GCCGGTGTCG CTTCCGCGCA GCGACTCCGG GAGGATCCCG GCGTGTTCCA 
28381 GCGCCTCCCA CGAGGTCTCC AGGACCAGAC GCTGCTGCGG GTCCATCGCC AGCGCCTCAC 
28441 GCGGACTGAT CCCGAAGAAC GCCGCGTCGA AGTCCGCCAC CCCGGCGAGG AAGCCACCAT 
28501 GACGCACGGT CGACGTGCCC GGATGATCCG GATCGGGATC GTACAGCCCG TCCACGTCCC 
28561 AACCACGGTC CGTCGGAAAC GCCGTGATCC CGTCACCACC CGACTCCAGC AGCCGCCACA 
28621 AGTCCTCCGG CGACGCGACC CCACCCGGCA GCCGGCAGGC CATCCCCACG ATCGCCAACG 
28681 GCTCGTCCTG CCGGACGGCC GCGGTCGTGG TGCGGGTCGG CGATGCCGTC CGGCCGGACA 
28741 GCGCCGCGGT GAGCTTCGCC GCGACGGCGC GCGGCGTCGG GAAGTCGAAG ACCGCGGTGG 
28801 CGGGCAGCCG TACGCCCGTC GCCTCGGTGA AGGCGTTGCG CAGCCGGATC GCCATGAGCG 
28861 AGTCGACGCC GAGTTCCTTG AACGTGGCGG TCGCCTCGAC CCGTGCGGCA CCGTCGTGGC 
28921 CGAGTACGGC CGCGGTGCAC TGCCGGACGA CGGCGAGCAC GTCCTTTTCG GCGTCCGCGG 
28981 CGGAGAGCCG CGCGATCCGG TCGGCGAGGG TGGTGGCGCC GGCCGCCCGG CGCCGCGGCT 
29041 CCCGGCGCGG TGCGCGCAGC AGGGGCGAGC TGCCGAGGCC GGCCGGGTCG GCGGCGACCA 
29101 GCGCCGGGTC CGAGGACCGC AACGCCGCGT CGAACAGCGT CAGTCCGCCT TCGGCGGTCA 
29161 GCGCCGTCAC GCCGTCGCGG CGCATGCGGG CGCCGGTGCC GACCGTCAGC CCGCTCTCCG 
29221 GTTCCCACAG GCCCCAGGCC ACGGACAACG CGGGCAGTCC GGCTGCCCGG CGCTGTTCGG 
29281 CCAGCGCGTC GAGGAACGCG TTCGCGGCCG CGTAGTTGCC CTGTCCGGGG CTGCCGAGCA 
29341 CACCGGCGGC CGACGAGTAG AGGACGAACG CGGCCAGTTC CGTGTCCTGG GTGAGTTCGT 
29401 GCAGGTGCCA CGCGGCGTCC ACCTTCGGGC GCAGCACCGT CTCGAGCCGG TCGGGGGTGA 
29461 GCGCGGTGAG GACGCCGTCG TCGAGGACGG CCGCGGTGTG CACGACGGCC GTGAGCGGGT 
29521 GCGCCGGGTC GATCCCCGCC AGTACGGAGG CGAGTTCGTC CCGGTCGGCG ACGTCGCAGG 
29581 CGATCGCCGT GACCTCGGCG CCGGGCACGT CGCTCGCCGT GCCGCTGCGC GACAGCATCA 
29641 GCAGCCGGCG CACGCCGTGG CGTTCGACGA GGTGGCGGCT GATGATGCCG GCCAGCGTCC 
29701 CGGAGCCACC GGTGACGAGC ACGGTGCCGT CCGGGTCGAG CGCCGGAGCG TCACCCGCCG 
29761 GGACCGCCGG GGCCAGACGG CGGGCGTACA CCTGGCCGTC ACGCAGCACC ACCTGGGGCT 
29821 CATCGAGCGC GGTGGCCGCT GCGAGCAGCG GCTCGGCGGT GTCCGGGGCG GCGTCGACGA 
29881 GGACGATCCG GCCGGGGTGT TCGGCCTGCG CGGTCCGCAC CAGTCCGGCG GCCGCGGCCG 
29941 ACGCGAGACC GGGCCCGGTG TGGACGGCCA GGACCGCGTC GGCGTACCGG TCGTCGGTGA 
30001 GGAAGCGCTG CACGGCGGTC AGGACGCCGG CGCCCAGTTC GCGGGTGTCG TCGAGCGGGG 
30061 CACCGCCGCC GCCGTGCGCG GGGAGGATCA CCACGTCCGG GACCGTCGGG TCGTCGAGGC 
30121 GGCCGGTCGT CGCGGTCGTG GGCGGCAGCT CCGGGAGCTC GGCCAGCACC GGGCGCAGCA 
30181 GGCCCGGAAC GGCTCCCGTG ATCGTCAGGG GGCGCCTGCG CACGGCGCCG ATGGTGGCGA 
30241 CGGGCCCGCC GGTCTCGTCC GCGAGGTGTA CGCCGTCAGC GGTGACGGCG ACGCGTACCG 
30301 CCGTGGCGCC GGTGGCGTGG ACGCGGACGT CGTCGAACGC GTACGGAAGG TGGTCCCCTT 
30361 CCGCGGCGAG GCGGAGTGCG GCGCCGAGCA GCGCCGGGTC CAGGCCGTAC CGTCCGGCGT 
30421 CGGCGAGCTG TCCGTCGGCG AGGGCCACTT 
30481 CGGCGCGCGG GCGGGGCAGC GCGGGCCCGT 
30541 CGATGTCGTC GGGGTCCACC GGCCGGGCCG 
30601 GCACGGCCGG GGCCGTCCGC GGGTCGGGGG 
30661 CCCCCGCCGC GTGCCGCGTG TGCACGGTGA 
30721 TCACCGTGAC GGAGAGCGCG AGCGCACCGG 
30781 TGAACGTGTC GAGGGCGCCG CAGCCGGCTT 
30841 GGGCCGCGGC GGGCAGCACC GCGAGGCCGT 
30901 CGACCCGGCC GGTGAGCACC AGGTCGCCGG 
30961 CCGGGTGCGC GACCGGCGTC TGTCCGGCCG 
31021 GCCAGTAGCG GACCCGCTCG AACGGGTACG 
31081 GGTCGATGAC CTTCGGCCAG TCGACCGTGA 
31141 TCAGGGCGGA TCGCGGTTCG TCGTCGGCGT 
31201 TCAGGCTCCG GTCCGGGCCG ATCTCCAGGA 
31261 CCCCGAACCG GACGGTGTCG CGGACCTGTC 
31321 CGCCCGCGGC CATCGGGATC CTCGGCTCGT 
31381 ACTCCTCGAG CATCGGCTCC ATCCGCGCCG 
314 41 TGAAGCGGCC GAGCCGGGCC GCGACGTCGA 
31501 TCGACGCGGG CCCGTTGACC GCGGCGATCT 
31561 CCCGTTCCGA CGCGATCACG GCGGCCATCG 
31621 GGGCCCGTGC GGACACCAGC CTGCACGCGT 
31681 CGGCGGCCAG CTCGCCGATC GAATGGCCCA 
31741 CGAGCTGTGC GCCGAGTGCG ACCTGGAGCG 
31801 CGTGGAGGTC GAGCCCGGCG. GGCACGTCGA 
318 61 CGAAGACGTC GTAGGCGGCG GCCAGTCCGT 
31921 CGGAGAAGAG CCACACGAGG CGGCGGTCCG 
31981 CGATCAGCGC GGCCCGGTGC GGGAAGGCCG 
32041 GCTCGTCCTC CTCGCCGGTG GCGAGGTGGG 
32101 CCTGCGGGGT GCGTGCCGAG AGCAGCAGGG 
32161 GTTCGGGGGC CGGTCGGGGG TGGCTTTCGA 
32221 CGAAGGAGGA CACCCCGGCG CGCCGTGGGC 
32281 TGAGGAGTTC GACGGCGCCG GCCGTCCAGT 
32341 GGGTGCGCGG CAGGGTGCCG TGCCGCATGG 
324 01 CCGCGGCGGC CTGAGTGTGG CCGATGTTGG 
324 61 CGCGATGCTG CCCGTAGGTG GCCAGTACCG 
32521 TCCCGGTGCC ATGCGGCTCG ACAGCGTCCA 
32581 GCGCCTGCCG GATCACCCGC TCCTGCGACG 
32 641 CACCGTCCTG GTTGACCGCC GAACCACGCA 
32701 CGGCGTCGGA GAGCCTCTCG ACGATCAGCA 
327 61 CAGCCGCATC CGCGAACGCC TTGCAGCGGC 
32821 AGTCCACGAA GCCGGACGGC GAGGCCATCA 
32881 ACTCCCCCGA GCGCAGCGAC TGCCCGGCCT 
32941 CCGTGTCCAC CGTGACCGCC GGACCCTCCA 
33001 GCACACTGGT CTGGGTGCTG GTGGCACCGA 
33061 CGTAGAAGTA GCCGCCCATG AACACGCCGG 
33121 TCCCGGCGTG TTCCAGCGCC TCCCACGAGG 
33181 TCGCCAGCGC CTCACGCGGA CTGATCCCGA 
332 41 CGAGGAAGCC ACCATGACGC ACGGTCGACG 
33301 GCCCGTCCAC GTCCCAACCA CGGTCCGTCG 
33361 CCAGCAGCCG CCACAAGTCC TCCGGCGACG 
334 21 CCACGATCGC CAACGGCTCG TCCTGCCGGA 
334 81 TGGCCCGCGC GCCGGCCAGT TCGTCCAGGT 
33541 CGAAGACGAG CGTAGCGGGC AGCGTCAGGC 
33601 CGACGCCGGT CAGCGAGTCG AAGCCCACTT 



CCGCCCAGAC GGCGTCGTCG TCGGCCCAGA 
CCGTGTACCC GGCTCGGGCC AGACGGTCGG 
TGGCGGGCGG CCACGTCGAC GGCATCTCCC 
CGAGGATTCC GTGCGCGTGC TCGGTCCACT 
CCGCGCGGCG GCCGTCCGCC CCGGGCGCGC 
ACCGCGGCAG CGTGAGGGGG GTGTCCACGG 
CGTCGCCCGC CCGGATCGCC AGATCCAGGA 
GCAGGGAGTG CGCCAGCGGA TCGGCGGCGT 
TGCCGGGCAG GGTGACCGCC GCGGTCAGCG 
GGGCCGCGTC GCCCGCGGTC TGGGTGCCGA 
TCGGCGGGTG CGAGGCGCGT GCCGGCGCGG 
CGCCGTCGGT GTGCAGCCGG GCGAGCGCGG 
GCAGCATCGG GATGCCGTCG ACGAGTCGGG 
GCACCGCCCC GTCGTGCGCG GCGACCTGTT 
GTACCCAGTA CTCCGGCGTG GTGCAGGCGG 
GGTACGTCAG GCTCTCCGCG ACCTTGCGGA 
AGTGGAACGC GTGGCTGGTC CGCAGGCGGG 
GCACCGCCTC CTCGTCACCG GAGAGCACGA 
CCACGCCGTC CCGCAGCAGC GGCAGCGCGT 
CCCCGCCGGA CGGCAGCGCC TGCATCAGGC 
CCTCCAGGGA CCAGACGCCG GCG ACGTACG 
CGAAGGCGTC CGGGCGTACG CCCCACGCCT 
CGAACACCGC GGGCTGGGCG TACCCGGTGT 
GGGCGTCCAG CACCTCGCGG CGAGTGCGGG 
CGCCCATGCC GGGACGTTGT GAGCCCTGTC 
GTTCTGCGGC GCCGGTGACC GTGTCGGTGC 
TGCGGGCGAG CAGGGCCGCG GCCACCGCGC 
CGCGCAGGCG GTGTACCTGT GCGTCGAGTG 
GCAGCGGTCC GGTGTCGGGT GCCGGGGCGG 
GGATGATGTG AGCGTTGGTG CCGCTAACGC 
GGTCGGTTTC GGGCCAGGGG CGGGCGTCGG 
CGACGTGCGA GGACGGCGTG TCCACGTGCA 
CGAGGACCAT CTTGATGACA CCGGCGACGC 
ACTTCAGCGA GCCCAGChGC ACCGGGGTGT 
CCTGCGCCTC GATGGGGTCG CCCAGCCTGG 
CATCCGCCGG GGTGAGCCCG GCGTTGGCCA 
GCCCGTTCGG CGCCGACAAC CCGTTGGAAG 
CGACCGCCAG GACATTGTGG CCGTGCCGCT 
CACCGGATCC CTCGGCGAAA CCGGTGCCAT 
CGTCCGGGGA GAGGCCCCGC TGCTGGGAGA 
CCGTGACGCC GCCGACCACG GCGAGCGAGC 
GGTGCAGCGC CACCAGCGAC GACGAACACG 
AACCGTAGAA GTACGACAGC CGACCGGACA 
AACCGCCGCG GTCGGCTCCA GTGCCGTACC 
TGTCGCTTCC GCGCAGCGAC TCCGGGAGGA 
TCTCCAGGAC CAGACGCTGC TGCGGGTCCA 
AGAACGCCGC GTCGAAGTCC GCCACCCCGG 
TGCCCGGATG ATCCGGATCG GGATCGTACA 
GAAACGCCGT GATCCCGTCA CCACCCGACT 
CGACCCCACC CGGCAGCCGG CAGGCCATCC 
CGGCCGCGGT CGGGGTACGC CGCCGGGTGG 
GGGCGGCGAG CGCCTGCGCC GTGGGGTGGT 
CCGTCGCGTC GGCCAGCCGG TTGCGCAGTT 
CCCTGAACGC GCGCGCGGGT GCGATGGCGT 
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33661 GGGCGTCGCG GTGGCCGAGC ACCGCGGCAG CGCTGGTACG GACGAGGTCG AGCATGTCGC 
33721 GCGCGGCCGG AGGTGCGGAC GTGCGCCGGA CGGCCGGCAC GAGGGTGCGT AGGACCGGCG 
33781 GGACCCGGTC GGACGCGGCG ACGGCGGCGA GGTCGAGCCG GATCGGCACG AGCGCGGGCC 
33841 GGTCGGTGTG CAGGGCCGCG TCGAACAGGG CGAGCCCCTG TGCGGCCGTC ATCGGGGTCA 
33901 TGCCGTTGCG GGCGATGCGG GCCAGGTCGG TGGCGGTCAG CCGCCCGCCC ATCCCGTCCG
33961 CCGCGTCCCA CAGTCCCCAG GCGAGCGAGA CGGCGGGCAG CCCCTGGTGG TGCCGGTGGC 
34021 GGGCGAGCGC GTCGAGGAAC GCGTTGCCGG TCGCGTAGTT GGCCTGACCC GCGCCGCCGA 
34081 ACGTGGCGGA TATGGACGAG TACAGGACGA ACGCGGCCAG GTCGAGATCG CGCGTCAGCT
34141 CGTGCAGGTG CCAGGCGACG TCCGCCTTGA CCCGCAGCAC GGCGTCCCAC TGCTCCGGCC
34201 GCATGGTCGT CACGGCCGCG TCGTCGACGA TCCCGGCCAT GTGCACGACG GCGCGCAGCC
34261 GCTGGGCGAC GTCGGCGACG ACTGCGGCCA GCTCGTCGCG GTCGACGACG TCGGCGGCCA
34321 CGTACCGCAC GCGGTCGTCC TCCGGCGTGT CGCCGGGCCG GCCGTTGCGG GACACCACGA
34381 CGACCTCGGC GGCCTCGTGC ACGGTGAGCA GGTGGTCCAC GAGGAGGCGG CCGAGCCCGC
34441 CGGTGCCGCC GGTGACGAGG ACGGTCCCGC CGGTCAGCGG GGAGGTTCCG GTGGCCGCGG
34501 CGACACGGCG CAGACGGGCC GCACGCGCTG TGCCGTCGGC GACCCGGACG TGCGGCTCGT
34561 CGCCGGCGGC GAGCCCGGCC GCTATGGCGG CGGGCGTGAT CTCGTCCGCT TCGATCAGGG
34621 CGACGCGGCC GGGATGCTCC GTCTCCGCCG TCCGGACCAG GCCGCCGAGC GCTTCCTGCG
34681 CGGGATCGCC GGTACGGGTG GCCACGATGA GCCGGGATCG CGCCCAGCGC GGCTCGGCGA
34741 GCCAGGTCTG CACGGTGGTG AGCAGGTCGC GGCCCAGCTC CCGGGTCCGG GCGCCGGGCG
34801 AGGTGCCCGG GTCGCCGGGT TCCACGGCCA GGACCACGAC CGGGGGGTGC TCGCCGTCGG
34861 GCACGTCGGC GAGGTACGTC CAGTCGGGGA CGGGTGACGC GGGCACGGGC ACCCAGGCGA
34921 TCTCGAACAG CGCCTCGGCA TCGGGGTCGG CGGCCCGCAC GGTCAGGCTG TCGACGTCAA
34981 GGACCGGTGA GCCGTGCTCG TCCGTGGCGA CGATGCGGAC CATGTCGGGG CCGACGCGTT
35041 CCAGCAGCAC GCGCAGCGCG GTCGCGGCGC GCGCGTGGAT CCTCACGCCG GACCAGGAGA
35101 ACGCCAGCCG GCGCCGCTCC GGGTCCGTGA AGACCGTCCC GAGGGCGTGC AGGGCCGCGT
35161 CGAGCAGCAC GGGGTGCAGC CCGTACCGGG CGTCGGTGAG CTGTTCGGCG AGGCGGACCG
35221 ACGCGTAGGC GCGGCCCTCC CCCGTCCACA TCGCGGTCAT GGCCCGGAAC GCGGGCCCGT
35281 ACGAGAGCGG CAGCGCGTCG TAGAAGCCGG TCAGGTCGGC CGGGTCGGCG TCGGCGGGCG
35341 GGCAGTCCAC GGGCTCCGCC GGACCGCCAG TGTCCACGCT CAGCGCTCCG GTCGCACTGA
35401 GCGCCCAGGG GCCCGTGCCG GTACGGCTGT GCAGACTCAC CGACCGCCGT CCGGACACCT
35461 CGGTTCCGAC GGTGGCCTGG ATCTCCGTGT CGCCGTCGCC GTCGACCACC ACCGGCGCGA
35521 CGATGGTCAG CTCCGCGATC TCCGGCGTGC CGAGCCGGGC TCCCGCTTCG GCGAGCAGTT
35581 CCACGAGCGC CGAGCCGGGC ACGATGACCC GGCCGTCCAC CTCGTGGTCG GCGAGCCAGG
35641 GCTGACGGCG TACCGAGACA CCGCGGTGGC CAGCGCGCCC TCGCCGTCGG GCGAGGTCGA
35701 CCCACGAGCC GAGCAGCGGG TGGCCGGACG TTCCCGCCGG TTCCGCGTCG ATCCAGTAGC
35761 GGTCACGGCG GAACGGGTAC GTGGGCAGCG GCACCACCCG ACGCGTCGCG AACGACCAGG
35821 TGACGGGCAC GCCCCGGACC CAGAGCGCGG CGAGCGACCG AGTGAAGCGG TCCAGGCCGC 
35881 CCTCGCCTCG CCGCAGTGTG CCGGTGACGA CCGTATGCGC ATGCCCGGCG AGCGTGTCCT 
35941 CCAGTGCGGT GGTGAGCACG GGATGCGCGC TGACCTCGAC GAACGCGCGG TATCCGCGGT 
36001 CCGCCAGGTG GCCGGTCGCG GCGGCGAACC GAACGGTGCG GCGCAGGTTG TCGTACCAGT
36061 AGGCGGCGTC CGCGGGCCGG TCCAGCCACG CCTCGTCCAC GGTGGAGAAG AACGGGACGT 
36121 CCGGCGTGCG CGGAGTGATG CCGGCGAGAG CGTCGAGCAG CGCGCCGCGG ATCGTTTCGA 
36181 CATGCGCGGT GTGCGACGCG TAGTCGACGG CGATCCGGCG GGCGCGGGGG GTGGCGGCCA 
36241 GCAGCTCCTC CACGGCGTCG GCCGCACCGG CGACAACGAT CGACGCGGGT CCGTTGACCG 
36301 CGGCGACCTC CAGGCGCCCG GCCCACACGG CGGCGTCGAA GTCGGCGGGC GGCACCGAGA
36361 CCATGCCGCC CTGCCCGGCC AGTTCGGTGG CGACGAGTCG GCTGCGCACC GCGACGACCT 
36421 TCGCGGCGTC GTCCAGGGTG AGCACCCCGG CGACGCAGGC CGCGGCGACT TCGCCCTGGG
36481 AGTGGCCGAC GACCGCGGCC GGGGCGACCC CGTGCGCACG CCACAGCTCC GCCAGCGCCA
36541 CCATCACCGC GAACGACGCG GGCTGCACGA CATCGACCCG GTCGAACGCG GGCGCTCCGG 
36601 GCCGCTGGGC GATGACGTCC AGCAGGTCCC ATCCGGTGTG CGGGGCGAGC GCCGTGGCGC
36661 ACTCGCGGAG CCGCCGGGCG AACACGGGCT CGGTGGCGAG CAGTTCGGCA CCCATGCCGG 
36721 CCCACTGGGA GCCCTGCCCG GGGAACGCGA ACACGACACG TGTGTCGGTG ACGTCGGCGG 
36781 TTCCCGTCAC GGCCCCCGGC ACTTCGGCAC CACGGGCGAA CGCCTCCGCC TCTCGGGCCG 
36841 GCACGACCGC CCGGTGGCGC ATGGCCGTCC GGGTGGTGGC GAGCGAGTGG CCGACCGCGG
36901 CCGCGGCGCC AGTGAGCGGG GCCAGCTGTC CCGCGACGTC CCGCAGTCCC TCCGGGGTCC
36961 GGGCCGACAT CGGCCAGACC ACGTCCTCGG GCACCGGCTC GGCTTCGGGT GCGGACACGG
37021 GTGCGGGCGC GGCGGGGGGC CCGGCCTCCA GGACGACATG GGCGTTGGTG CCGCTGATGC
37081 CGAACGACGA GACACCCGCA CGCCGGGCGC GCCCGGTGAC CGGCCACGGC TCACTGCGGT
37141 GCAGCAGCCG GATGTCGCCG TCCCAGTCGA CGTGCCGGGA CGGCTCGTCG ACGTGCAGCG
37201 TGCGCGGCAG GACGCCGTGC CGCATCGCCA TGACCATCTT GATGACGCCG GCGACGCCGG
37261 CCGCGGCCTG GGTGTGGCCG ATGTTCGACT TGAGCGAGCC GATCAGCAGC GGATGCACGC
37321 GTTCGCGCCC GTAGGCCACT TGCAGGGCCT GGGCCTCGAC GGGGTCGCCG AGACGGGTGC
37381 CGGTGCCGTG TGCCTCCACG GCGTCGACGT CACCCGGCGC CAGGCCGGCG TCGGCGAGCG
37441 CACGCTGGAT GACGCGCTGC TGCGCAGGCC CGTTCGGGGC GGACAGCCCG TTCGACGCGC
37501 CGTCGGAGTT GACCGCGGAG CCGCGCACCA GCGCCAGCAC GGGGTGGCCG TGGCGGGTGG
37561 CGTCGGAGAG CCGCTCCAGC ACCAGGACAC CGGCGCCCTC GGCGAAGCTC GTGCCGTCCG
37621 CGGTGTCCGC GAAGGCCTTG GCACGGCCGT CGGGGGCGAG CCCGCGCTGC CGGGAGAACT
37681 CGACGAACCC GGTCGTCGTC GCCATCACCG TGACACCGCC GACCAGGGCG AGCGAGCACT
37741 CCCCCGAGCG CAGCGACCGC GCGGCCTGGT GCAGCGCCAC CAGCGACGAC GAACACGCCG
37801 TGTCGACGGT GACCGACGGG CCCTCCAGAC CGAAGTAGTA CGAGAGCCGC CCGGAGAGAA
37861 CGCTGG1CGG CGTGCCGGTC GCCCCGAAAC CGCCCAGGTC CACGCCCGCG CCGTAGCCCT
37921 GGGTGAACGC GCCCATGAAT ACGCCGGTGT CGCTGCCGCG GACGCTTTCG GGCAGGATGC
37981 CCGCTCGTTC GAACGCCTCC CACGACGCTT CGAGGACCAG ACGCTGCTGC GGGTCCATCG
38041 CCAGCGCCTC ACGCGGGCTG ATCCCGAAGA ACGCGGCGTC GAAGTCGGCG GCGCCGGTGA
38101 GGAAGCCGCC GTGACGCACG GAAACCTTGC CGACCGCGTC GGGGTTCGGG TCGTAGAGCG
38161 CGGCGAGGTC CCAGCCGCGG TCGGCGGGGA ACTCGGTGAT CGCGTCCCCG CCGGAGTCGA
38221 CCAGCCGCCA CAGGTCCTCC GGTGACCGCA CGCCACCGGG CATCCGGCAC GCCATGGCCA
38281 CGATCGCCAG CGGCTCGTTC CCCGCCACCG TCGGTGCGGG CACTGTCGCC GCCGGAGCGG
38341 CAGGGGCCGG CTCACCCCGC CGTTCCTCAT CCAGGCGGGC GGCGAGCGCG GCCGGTGTCG
38401 GGTGGTCGAA GACGGCCGTC GCGGAGAGCC GTACCCCCGT CGTCTCGGCG AGGCTGTTGC
38461 GCAACCGGAC ACCGCTGAGC GAGTCGATGC CGAGGTCCTT GAACGCCGTC GTGGGCGTGA
38521 TCTCGGAGGC GTCGGCGTGG CCGAGCACGG CGGCCGTGGC CGCACACACG ATGGCCAGCA
38581 GGTCACGATC GCGGTCGCGG TCGCGGTCGC GGTTGTCCTC CGCACGGGCG GCGATGCGGC
38641 GCTCGGTCCG CTGCCGGACG GGCTCGGTGG GAATCGCCGC GACCATGAAC GGCACGTCCG
38701 CGGCGAGGCT CGCGTCGATG AAGTGGGTGC CCTCGGCCTC GGTGAGCGGC CGGAACCCGT
38761 CGCGCACCCG CTGCCGGTCG GCGTCGTCAA GTTGTCCGGT GAGGGTGCTG GTGGTGTGCC
38821 ACATGCCCCA GGCGATGGAG GTGGCGGGTT GGCCGAGGGT GTGGCGGTGG GTGGCGAGGG
38881 CGTCGAGGAA GGCGTTGGCG GCGGCGTAGT TTCCTTGTCC GGGGCTGCCG AGGACGGCGG
38941 CGGCGCTGGA GTAGAGGACG AAGTGGGTGA GGGGTTGGTT TTGGGTGAGG TGGTGCAGGT
39001 GCCAGGCGGC GTTGGCTTTG GGGTGGAGGA CGGTGGTGAG GCGGTCGGGG GTGAGGGCGT 
39061 CGAGGATGCC GTCGTCGAGG GTGGCGGCGG TGTGGAAGAC GGCGGTGAGG GGTTGGGGGA 
39121 TGTGGGCGAG GGTGGTGGCG AGTTGGTGGG GGTCGCCGAC GTCGCAGGGG AGGTGGGTGC 
39181 CGGGGGTGGT GTCGGGGGGT GGGGTGCGGG AGAGGAGGTA GGTGTGGGGG TGGTTCAGGT 

39241 GGCGGGCGAG GATGCCGGCG AGGGTGCCGG AGCCGCCGGT GATGATGATG GCGTGTTCGG
39301 GGTTGAGGGG GGTGGTGGTG GGTGGGGTGG TGGTGTGGAG GGGGGTGAGG TGGGGTCGGT 
39361 GGAGGGTGTG GTGGGTGAGG CGGAGGTGGG GGTGGTCGAG GGTGGCGAGT TGGGCCAGGG 
39421 GGAGGGGAGT GTGGGGGTGG TCGGTTTCGA TGAGGCGGAT GCGGTGGGGG TGTTCGTTCT
39481 GGGCGGTGCG GGTGAGGCCG GTGACGGTGG CGCCGGCGGG GTCGGTGGTG GTGTGGACGA 

39541 TGAGGGTGTG GTCGGTGGTG GTGAGGTGGT GTTGCAGGGC GGTCAGGACG CGGGTGGCGC
39601 GGGTGTGGGC GCGGGTGGGT ATGTCCTCGG GGTCGTCGGG GTGGGCGGCG GTGATCAGGA 
39661 CGTGTCCCTC GGGCAGGTCA CCGTCGTAGA CCGCCTCGGC GACCGCGAGC CACTCCAACC 
39721 GGAGCGGGTT CGGCCCCGAC GGGGTGTCGG CCCGCTCCCT CAGCACCAGC GAGTCCACCG 
39781 ACACGACAGG ACGGCCATCC GGGTCGGCCA CGCGCACGGC GACGCCGGCC TCCCCCCGGG 

39841 TGAGGGCGAC GCGCACCGCG GCGGCCCCGG TGGCGTTCAG GCGCACGCCC GTCCAGGAGA
39901 ACGGCAGCTC GATCCCGCCG CCCGCGTCGA GGCGCCCGGC GTGCAGGGCC GCGTCGAGCA 
39961 GTGCCGGATG CACACCGAAA CCGTCCGCCT CGGCGGCCTG CTCGTCGGGC AGCGCCACCT 
40021 CGGCATACAC GGTGTCACCA TCACGCCAGG CAGCCCGCAA CCCCTGGAAC GCCGACCCGT
40081 ACTCATAACC GGCATCCCGC AGTTCGTCAT AGAACCCCGA GACGTCGACG GCCGCGGCCG
40141 TGGCCGGCGG CCACTGCGAG AACGGCTCAC CGGAAGCGTT GGAGGTATCC GGGGTGTCGG
40201 GGGTCAGGGT GCCGCTGGCG TGCCGGGTCC AGCTGCCCGT GCCCTCGGTA CGCGCGTGGA
40261 CGGTCACCGG CCGCCGTCCG GCCTCATCGG CCCCTTCCAC GGTCACCGAC ACATCCACCG
40321 CTGCGGTCAC CGGCACCACG AGCGGGGATT CGATGACCAG TTCATCCACC ACCCCGCAAC
40381 CGGTCTCGTC ACCGGCCCGG ATGACCAGCT CCACAAACGC CGTACCCGGC AGCAGAACCG
40441 TGCCCCGCAC CGCGTGATCA GCCAGCCAGG GATGCGTACG CAATGAGATC CGGCCGGTGA
40501 GAACAACACC ACCACCGTCG TCGGCGGGCA GTGCTGTGAC GGCGGCCAGC ATCGGATGCG
40561 CCGCCCCGGT CAGCCCGGCC GCGGACAGGT CGGTGGCACC GGCCGCCTCC AGCCAGTACC
40621 GCCTGTGCTC GAACGCGTAG GTGGGCAGAT CCAGCAGCCG CCCCGGCACC GGTTCGACCA
40681 CCGTGCCCCA GTCCACCCCC GCACCCAGAG TCCACGCCTG CGCCAACGCC CCCAGCCACC
40741 GCTCCCAGCC ACCGTCACCA GTCCGCAACG ACGCCACCGT GCGGGCCTGT TCCATCGCCG
40801 GCAGCAGCAC CGGATGGGCA CTGCACTCCA CGAACACCGA CCCGTCCAGC TCCGCCACCG
40861 CCGCATCCAG CGCGACAGGG CGACGCAGGT TCCGGTACCA GTACCCCTCA TCCACCGGCT
40921 CGGTCACCCA GGCGCTGTCC ACGGTCGACC ACCACGCCAC CGACCCGGTC CCGCCGGAAA
40981 TTCCCTTCAG TACCTCAGCG AGTTCGTCCT CGATGGCCTC CACGTGAGGC GTGTGGGAGG
41041 CGTAGTCGAC CGCGATACGA CGCACCCGCA CCCCATCAGC CTCATACCGC GCCACCACCT 
41101 CCTCCACCGC CGACGGGTCC CCCGCCACCA CCGTCGAAGC CGGACCATTA CGCGCCGCGA 
41161 TCCACACACC CTCGACCAGA CCCACCTCAC CGGCCGGCAA CGCCACCGAA GCCATCGCCC 
41221 CCCGGCCGGC CAGCCGCGCC GCGATCACCC GACTGCGCAA CGCCACCACG CGGGCGGCGT 
41281 CCTCCAGGCT GAGGGCTCCG GCCACACACG CCGCCGCGAT CTCCCCCTGC GAGTGTCCGA
41341 CCACAGCGTC CGGCACGACC CCATGCGCCT GCCACAGCGC GGCCAGGCTC ACCGCGACCG 
41401 CCCAGCTGGC CGGCTGGACC ACCTCCACCC GCTCCGCCAC ATCCGACCGC GACAACATCT
41461 CCCGCACATC CCAGCCCGTG TGCGGCAACA ACGCCCGCGC ACACTCCTCC ATACGAGCCG
41521 CGAACACCGC GGAACGGTCC ATGAGTTCCA CGCCCATGCC CACCCACTGG GCACCCTGCC 
41581 CGGGGAAGAC GAACACCGTA CGCGGCTGAT CCACCGCCAC ACCCATCACC CGGGCATCAC 
41641 CCAGCAGCAC CGCACGGTGA CCGAAGACAG CACGCTCACG CACCAACCCC TGCGCGACCG 
41701 CGGCCACATC CACCCCACCC CCGCGCAGAT ACCCCTCCAG CCGCTCCACC TGCCCCCGCA 
41761 GACTCACCTC ACCACGAGCC GACACCGGCA ACGGCACCAA CCCATCACCA CCCGACTCCA
41821 CACGCGACGG CCCAGGAACA CCCTCCAGGA TCACGTGCGC GTTCGTACCG CTCACCCCGA 
41881 ACGACGACAC ACCCGCATGC GGTGCCCGAT CCGACTCGGG CCACGGCCTC GCCTCGGTGA 
41941 GCAGCTCCAC CGCACCGGCC GACCAGTCCA CATGCGACGA CGGCTCGTCC ACGTGCAGCG 
42001 TCTTCGGCGC GATCCCATGC CGCATCGCCA TGACCATCTT GATGACACCG GCGACACCCG
42061 CAGCCGCCTG CGCATGACCG ATGTTCGACT TGACCGAACC GAGGTAGAGC GGCGTGTCGC 
42121 GGTCCTGCCC GTAGGCCGCG AGGACGGCCT GCGCCTCGAT CGGGTCGCCC AGCCGCGTGC
42181 CGGTGCCGTG CGCCTCCACC ACGTCCACAT CGGCGGCGCG CAGTCCGGCG TTGACCAACG
42241 CCTGCCGGAT CACGCGCTGC TGGGCGACGC CGTTGGGGGC GGACAGTCCG TTGGAGGCAC 
42301 CGTCCTGGTT CACCGCCGAG CCGCGGACGA CCGCGAGAAC GGTGTGCCCG TTGCGCTCGG
42361 CGTCGGAGAG CCGCTCCAGC ACGAGAACGC CGACGCCCTC GGCGAAGCCG GTCCCGTCCG
42421 CCGCGTCGGC GAACGCCTTG CACCGTCCGT CCGGGGAGAG TCCGCGCTGC CGGGAGAACT
42481 CCACGAGCTC TGCGGTGTTC GCCATGACGG TGACACCGCC GACCAGCGCC AGGGAGCACT
42541 CCCCGGCCCG CAGTGCCTGT GCCGCCTGGT GCAGGGCGAC CAGCGACGAC GAGCACGCCG
42601 TGTCGACCGT GACCGCCGGG CCCTGAAGTC CGTACACGTA CGAGAGGCGC CCGGACAGGA
42661 CGCTCGTCTG CGTCGCCGTG ACACCGAGCC CGCCCAGGTC CCGGCCGACG CCGTAGCCCT
42721 GGTTGAACGC GCCCATGAAC ACGCCGGTGT CGCTCTCCCG GAGCCTGTCC GGCACGATGC
42781 CGGCGTTCTC GAACGCCTCC CAGGAGGTCT CCAGGATCAG GCGCTGCTGG GGGTCCATCG
42841 CCAGCGCCTC GTTCGGACTG ATGCCGAAGA ACGCGGCGTC GAACCCGGCG CCGGCCAGGA
42901 ATCCGCCGTG GCGTGTCGTG GAGCGGCCGG CCGCGTCCGG GTCCGGGTCG TACAGCGCGT
42961 CGACGTCCCA GCCCCGGTCG GTGGGGAACT CGGTGATCGC CTCGGTACCG GCGGCGACGA
43021 GCCGCCACAG GTCCTCCGGC GAGGCGACCC CGCCGGGCAG TCGGCACGCC ATGCCGACGA
43081 TCGCGACGGG GTCGCCGGAG CCGAGGGTCT GGGCGGTCGC GGGTGCCGCT GTCGCGGAGC
43141 CGGCGAGGTG GGCGGCGAAC GCACGCGGAG TGGGGTGGTC GAACGCGGTT GACGCGGGCA
43201 CCCGCAGACC CGTCCGCGCG GCGACGGTGT TGGTGAACTC GACGGTGGTG AGCGAGTCGA
43261 GGCCGTTCTC GCGGAACGTG CGGTCCGGGG AGCAGTGTCC GGCGCCCGGC AGGCCCAGGA
43321 CGGTGGCGAC GCTGTCGCGG ACCAGGTCGA GCAGTACGTC CTCCCGGCCC GCACGGGCCG
43381 CGGCGAGGCG GTTCGCCCAC TCCTGTTCCG TGGCGTCGGG CTCGGCCGGT CCGGTCAGTG
43441 CGGTGAGGAT CGGCGGCGTG GCGCCCGCCA TCGTCGCGGC CCGCGCCCCG GCGGAACCGG
43501 TCCGGGCCAC GATGTACGAG CCGCCGCCCG CGATGGCCTT CTCGATCAGG TCGCCGGTGA
43561 GCGCCGGCCG TTCGATGCCG GGCAGCGCGC GGACGGTGAC GGTGGGGAGT CCCTCCGCGG
43621 CCCGTGGCCG GGTGTGGGCG TCGGCGCCGG CCGGGCCGTC GAGCAGGACG TGCACGAGCG
43681 CGCCGGGGTT CGCGGCTTCC TCGGCTGCGG TGGTCACGTG GGTGAGGCCG GTCTCGTCGC
43741 GGAGCAGGCC GGCGACGGTG TCGGCGTCCT CCCCGGTGAC CAGGACCGGC GCGTCCGGGC
43801 CGATCGGAGG CGGCACGGTG AGGACCATCT TGCCGGTGTG CCGGGCGTGG CTCATCCACG
43861 CGAACGCGTC CCGCGCACGG CGGATGTCCC ACGGCTGCAC CGGCAGCGGG CACAGCTCAC
43921 CGCGGTCGAA CAGGTCGAGG AGCAGTTCGA GGATCTCCCG CAGGCGCGCG GGATCCACGT
43981 CGGCCAGGTC GAACGGCTGC TGGGCGGCGT GGCGGATGTC GGTCTTGCCC ATCTCGACGA
44041 ACCGGCCGCC CGGTGCGAGC AGGCCGATGG ACGCGTCGAG GAGTTCACCG GTGAGCGAGT
44101 TGAGCACGAC GTCGACCGGC GGGAAGGTGT CGGCGAACGC GGCGCTGCGG GAGTTCGCCA
44161 CATGGTCGGT GTCGAAGCCG TCGGCGTGCA GCAGGTGTTG TTTGGCGGGA CTGGCGGTGG
44221 CGTACACCTC GGCGCCGAGG TGGCGGGCGA TCCGGGTCGC CGCCATGCCG ACACCGCCCG
44281 TCGCGGCGTG GACCAGGACC TTCTGGCCGG GTCGCAGCTC GCCCGCGTCG ACGAGGCCGT
44341 ACCAGGCGGT GGCGAACACG ATGGGCACGG ACGCGGCGAT GGGGAACGAC CATCCCCGTG
44401 GGATCCGTGC GACCAGCCGC CGGTCCGCGA CCACGCTGCG CCGGAACGCG TCCTGCACGA
44461 GACCGAACAC GCGGTCGCCG GGGGCCAGGT CGTCGACGCC GGGTCCGACT TCGGTCACGA
44521 TGCCCGCGGC CTCCCCGCCC ATCTCGCCCT CGCCCGGGTA GGTGCCGAGC GCGATCAGCA
44581 CGTCGCGGAA GTTCAGCCCC GCGGCGCGGA CGTCGATGCG GACCTCGCCG GCGGCCAGGG
44641 GCGCGGCGGG ACGTCGAGCG GGGCGACGAC GAGGTCGCGG AGCGTTCCGG AGGCGGGCGG
44701 GCGCAGCGCC CACTGGCGCG GTCGGCAGGG GGGTGGTGTC CGCGCGTACC AGCCGGGGCA
44761 CGTAGGCCAC GCCGGCCCGC AGCGCGATCT GGGGTTCGCC GAGCGAGGCC GCGGCGGGGA
44821 CGAGGTCGTC ATCGCCGTCC GTGTCCACCA GCACGAACGA TGCGGGTTCG GCGGCCTGGC
44881 GGCGCAGCGC CTCGTCCCAG AGCCGGGCCT GGTCCGCGTC CGGGATCTCG GCCGGGCCGA
44941 CGCCCACCGC GCGGCGGGTG ACGACCGTCC GGCGGGGTGA CGGGGTGCCG GGCAGGTCGC
45001 GCCGCTCCCA GACCAGTTCG CACAGCGTGG CCTCGCCACT GCCGGTGGCG ACCAGATGGG
45061 CCGGCAGCCC CGCGAGCCGC GCGCGCTGGA CCTTGCCCGA CGCGGTGCGG GGGATCGTGG
45121 TGACGTGCCA GATCTCGTCG GGCACCTTGA AGTAGGCGAG CCGGCGGCGG CACTCGGCGA
45181 GGATCGCCTC GGCGGGGACG CGGGGGCCGT CGGAAACGAC GTAGAGCACG GGTATGTCGC
45241 CGAGGACGGG GTGCGGGCGG CCCGCCGCGG CGGCGTCCCG GACACCGGCC ACCTCCTGGG
45301 CGACGGTCTC GATCTCCCGG GGGTGGATGT TCTCCCCGCC GCGGATGATC AGCTCCTTGA
45361 CCCGGCCGGT GATCGTCACG TGTCCGGTCT CGGCCTGACG TGCGAGGTCC CCGGTGCGGT
45421 ACCAGCCGTC CACGAGCACC TGGGCGGTCG CCTCCGGCTG GGCGTGGTAG CCGAGCATGA
45481 GGCTCGGCCC GCTCGCCCAC AGCTCGCCCT CCTCGCCGGG TGCCACGTCG GCGCCGGACA
45541 CCGGGTCGAC GAACCGCAGC GACAGGCCCG GCACGGGCAG CCCGCACGAG CCGGGAACCC
45601 GCGCATCCTC CAGGGTGTTG GCGGTGAGCG AGCCGGTCGT CTCGGTGCAG CCGTACGTGT
45661 CGAGCAGGGG CACGCCGAAC GTCGCCTCGA AATCCCTGGT GAGCGACGCC GGCGAGGTGG
45721 ATCCGGCGAC CAGCGCCACG CGCAGCGCGC GAGCCCGCGG CTCGCCGGAC ACGGCGCCGA
45781 GGAGGTAGCG GTACATCGTC GGCACGCCGA CGAGCACGGT GCTGGAGTGT TCGGCCAGGG
45841 CGTCGAGGAC GTCACGCGCG ACGAAGCCGC CCAGGATACG GGCGGACGCG CCGACCGTGA
45901 GGACGGCGAG CAGGCAGAGG TGGTGGCCGA GGCTGTGGAA CAGCGGGGCG GGCCAGAGCA
45961 GTTCGTCGTC CTCGGTCAGC CGCCAGGACG GCACGTCGCA GTGCATCGCG GACCACAGGC
46021 CGCTGCGCTG TGCGGAAACC ACGCCCTTGG GACGGCCGGT GGTGCCGGAG GTGTAGAGCA
46081 TCCAGGCGGG TTCGTCCAGG CCGAGGTCGT CGCGGGGCGG GCACGGCGGC TCGGTCCCGG
46141 CGAGGTCCTC GTAGGAGACG CAGTCCGGTG CCCGGCGCCC GACGAGCACG ACGGTGGCGT
46201 CGGTGCCGGT GCGGCGCACC TGGTCGAGGT GGGTTTCGTC GGTGACCAGC ACGGTCGCGC
46261 CGGAGTCCGT CAGGAAGTGG GCGAGTTCGG CGTCGGCGGC GTCCGGGTTG AGCGGGACGG
46321 CGACGGCGGC GGCGCGGGCG GCGGCGAGGT AGACCTCGAT GGTCTCGATC CGGTTGCCGA
46381 GCAGCATCGC GACCCGGTCG CCGCGGTCGA CGCCGGACGC GGCGAGGTGT CCGGCGAGCC
46441 GGCCGGCCCG GAGCCGGAGT TGCGTGTACG TCACGGCGCG TTGGGAATCC GTGTAGGCGA
46501 TCCGGTCGCC GCGTCGCTCG GCATGGATGC GGAGCAATTC GTGCAACGGC CGGATTGGTT
46561 CCACACGCGC CATGGAAACA CCTTTCTCTC GACCAACCGC ACAACAGCAC GGAACCGGCC
46621 ACGAGTAGAC GCCGGCGACG CTAGCAGCGT TTTCCGGACC GCCACCCCCT GAAGATCCCC
46681 CTACCGTGGC CGGCCTCCCC GGACGCTCAT CTAGGGGGTT GCACGCATAC CGCCGTGCGT
46741 AATTGCCTTC CTGATGACCG ATGCCGGACG CCAGGGAAGG GTGGAGGCGT TGTCCATATC
46801 TGTCACGGCG CCGTATTGCC GCTTCGAGAA GACCGGATCA CCGGACCTCG AGGGTGACGA
46861 GACGGTGCTC GGCCTGATCG AGCACGGCAC CGGCCACACC GACGTGTCGC TGGTGGACGG
46921 TGCTCCCCGG ACCGCCGTGC ACACCACGAC CCGTGACGAC GAGGCGTTCA CCGAGGTCTG
46981 GCACGCACAG CGCCCTGTCG AGTCCGGCAT GGACAACGGC ATCGCCTGGG CCCGCACCGA
47041 CGCGTACCTG TTCGGTGTCG TGCGCACCGG CGAGAGCGGC AGGTACGCCG ATGCCACCGC
47101 GGCCCTCTAC ACGAACGTCT TCCAGCTCAC CCGGTCGCTG GGGTATCCCC TGCTCGCCCG
47161 GACCTGGAAC TACGTCAGCG GTATCAACAC GACGAACGCG GACGGGCTGG AGGTGTACCG
47221 GGACTTCTGC GTGGGCCGCG CCCAGGCGCT CGACGAGGGC GGGATCGACC CGGCCACCAT
47281 GCCCGCGGCC ACCGGTATCG GCGCCCACGG GGGCGGCATC ACCTGCGTGT TCCTCGCCGC
47341 CCGGGGCGGA GTGCGGATCA ACATCGAGAA CCCCGCCGTC CTCACGGCCC ACCACTACCC
47401 GACGACGTAC GGTCCGCGGC CCCCGGTCTT CGCACGGGCC ACCTGGCTGG GCCCGCCGGA
47461 GGGGGGCCGG CTGTTCATCT CCGCGACGGC CGGCATCCTC GGACACCGAA CGGTGCACCA
47521 CGGTGATGTG ACCGGCCAGT GCGAGGTCGC CCTCGACAAC ATGGCCCGGG TCATCGGCGC
47581 GGAGAACCTG CGGCGCCACG GCGTCCAGCG GGGGCACGTC CTCGCCGACG TGGACCACCT
47641 CAAGGTCTAC GTCCGCCGCC GCGAGGATCT CGATACGGTC CGCCGGGTCT GCGCCGCACG
47701 CCTGTCGAGC ACCGCGGCCG TCGCCCTTTT GCACACCGAC ATAGCCCGCG AGGATCTGCT
47761 CGTCGAAATC GAAGGCATGG TGGCGTGACA ATACCCGGTA AAAGGCCCGC GACGCTGCGC
47821 CTCGGCGGAT CCGCGAAGAG AAAGAAGAGC GTCACCGCAC AGCGCGGCAG CCCGGTCCTT
47881 TCGTCCTTCG CACAGCGGCG GATCTGGTTT CTCCAGCAAT TGGACCCGGA GAGCAACGCC
47941 TATAATCTCC CGCTCGTGCA ACGCCTGCGC GGTCTATTGG ACGCGCCGGC CCTGGAGCGT
48001 GCGCTGGCGC TCGTCGTCGC GCGCCACGAG GCGTTGCGGA CGGTGTTCGA CACCGCCGAC
48061 GGCGAGCCCC TCCAGCGGGT GCTTCCCGCC CCGGAACACC TCCTGCGCCA CGCGCGGGCG
48121 GGCAGCGAGG AGGACGCCGC CCGGCTCGTC CGCGACGAGA TCGCCGCGCC GTTCGACCTC
48181 GCCACCGGGC CGTTGATCAG GGCCCTGCTG ATCCGCCTCG GTGACGACGA CCACGTTCTC
48241 GCGGTGACCG TGCACCATGT CGCCGGCGAC GGCTGGTCGT TCGGGCTCCT CCAACATGAA
48301 CTCGCAGCCC ACTACACGGC GCTGCGCGAC ACTGCCCGCC CTGCCGAACT GCCGCCGTTG
48361 CCGGTGCAGT ACGCCGACTT CGCCGCCTGG GAGCGGCGCG AACTCACCGG CGCCGGACTG
48421 GACAGGCGTC TGGCCTACTG GCGCGAGCAA CTCCGGGGCG CCCCGGCGCG GCTCGCCCTC
48481 CCCACCGACC GTCCCCGCCC GCCGGTCGCC GACGCGGACG CGGGCATGGC CGAGTGGCGG
48541 CCGCCGGCCG CGCTGGCCAC CGCGGTCCTC ACGCTCGCGC GCGACTCCGG TGCGTCCGTG
48601 TTCATGACCC TGCTGGCGGC CTTCCAAGCG GTCCTCGCCC GGCAGGCGGG CACGCGGGAC
48661 GTGCTGGTCG GCACGCCCGT GGCGAACCGT ACGCGGGCGG CGTACGAGGG CCTGATCGGC
48721 ATGTTCGTCA ACACGCTCGC GCTGCGCGGC GACCTCTCGG GCGATCCGTC GTTCCGGGAA
48781 CTCCTCGACC GCTGCCGGGC CACGACCACG GACGCGTTCG CCCACGCCGA CCTGCCGTTC
48841 GAGAACGTCA TCGAACTCGT CGCACCGGAA CGCGACCTGT CGGTCAACCC GGTCGTCCAG
48901 GTGCTGTTGC AGGTGCTGCG GCGCGACGCG GCGACGGCCG CGCTGCCCGG CATCGCGGCC
48961 GAACCGTTCG GCACCGGACG CTGGTTCACC CGCTTCGACC TCGAATTCCA TGTGTACGAG
49021 GAGCCGGGTG GCGCGCTGAC CGGCGAACTG CTCTACAGCC GTGCGCTGTT CGACGAGCCA
49081 CGGATCACGG GGTTGCTGGA GGAGTTCACG GCGGTGCTTC AGGCGGTCAC CGCCGACCCG
49141 GACGTACGGC TGTCGCGGCT GCCGGCCGGC GACGCGACGG CGGCAGCGCC CGTGGTGCCC
49201 TCGAACGACA CGGCGCGGGA CCTGCCCGTC GACACGCTGC CGGGCCTGCT GGCCCGGTAC
49261 GCCGCACGCA CCCCCGGCGC CGTGGCCGTC ACCGACCCGC ACATCTCCCT CACCTACGCG
49321 CAGCTGGACC GGCGGGCGAA CCGCCTCGCG CACCTGCTCC GCGCGCGCGG CACCGCCACC
49381 GGCGACCTGG TCGGGATCTG CGCCGATCGC GGCGCCGACC TGATCGTCGG CATCGTGGGG
49441 ATCCTCAAGG CGGGCGCCGC TTATGTGCCG CTGGACCCCG AACATCCTCC GGAGCGCACG
49501 GCGTTCGTGC TGGCCGACGC GCAGCTGACC ACGGTGGTGG CGCACGAGGT CTACCGTTCC
49561 CGGTTCCCCG ATGTGCCGCA CGTGGTGGCG TTGGACGACC CGGAGCTGGA CCGGCAGCCG
49621 GACGACACGG CGCCGGACGT CGAGCTGGAC CGGGACAGCC TCGCCTACGC GATCTACACG
49681 TCCGGGTCGA CCGGCAGGCC GAAGGCCGTG CTCATGCCGG GTGTCAGCGC CGTCAACCTG
49741 CTGCTCTGGC AGGAGCGCAC GATGGGCCGC GAGCCGGCCA GCCGCACCGT CCAGTTCGTG 
4 9801 ACGCCCACGT TCGACTACTC GGTGCAGGAG ATCTTTTCCG CGCTGCTGGG CGGCACGCTC 
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49861 


GTCATCCCGC 


CGGACGAGGT 






49921 


CAGGCGATTA 


CCCGGATCTA 






49981 


GATCCGCACA 


GCGACCAGCT 






50041 


ATCCTCGACG 


CGCGGTTGCG 




5 


50101 


CACTACGGTC 


CGGCCGAAAG 






50161 


GCGTGGCCCG 


CCACCGCACC 






50221 


GACGAGGCGA 


TGCGGCCGGT 






50281 


GGCCTCGCCC 


GTGGGTACCT 






50341 


GATGCGGTCG 


GCGAGGAGCG 




10 


50401 


GGCGACCTGG 


AATTCCTCGG 






50461 


GAACCGGGTG 


AGATCGAGAG 






50521 


TCCGTGCGCG 


AGGACCGGCG 






50581 


GGCCGGCACG 


GCGACGACTT 






50641 


GCCGCGCTCG 


TGCCCTCCGC 




15 


50701 


AAGGTGGACC 


GGCGCGCGCT 






50761 


ACGCCCCGCA 


CCGATGCCGA 






50821 


CCGCGGGTCG 


GTGCCGACGA 






50881 


CGGGTCGTCT 


CCCGCATCCG 






50941 


GACGGGCGGA 


CGCCCGCCGC 


■ Ffc 


20 


51001 


CCCCCGATCG 


CGCCCTCCGC 






51061 


ATGCTGCACT 


CGCACGGCTC 


■P 




51121 


TTCCGGCTGC 


GCGGGCCACT 


0 




. 51181 


GCGCGCCACG 


AGCCGCTGCG 






51241 


GCTCCGGTGC 


GCGCCGAGGT 




25 


51301 


GTCGCCCACC 


GGGAGCTGAC 


on 




51361 


GTGCTGCTGC 


CGCTGGGCGC 




51421 


GGTGACGGAT 


GGTCCTTCGA 


E 




51481 


CCGGTGTCCT 


ACACGGACGT 






51541 


GAGAACGACC 


GGGCCTACTG 


00 


30 


51601 


GCGGTCCGGC 


CCGGCGGGGC 


HI 




51661 


GCCGTCCTGG 


CGGCACGCCG 






51721 


CTCGGCGCCT 


TCGCCCTGGT 


0 




51781 


ACGCCGTTCG 


CGGACCGGGG 






51841 


GTCCTCGCGC 


TGCGCCTCGA 




35 


51901 


GTGCACACCG 


CGATGGTGGG 






51961 


GCCGAGGACC 


CCGCGCTGCC 






52021 


GCGGAACTGC 


GGCTGCCCGG 






52081 


GACGAGATGA 


CCGGCGAACT 






52141 


GCGGTGGTCC 


ACGATGCCGC 




40 


52201 


GTGGAGGCGA 


CGCTGCGTGC 






52261 


GAAAGCGAGT 


AGCCATGCCC 






52321 


CGGAACTCCA 


GAAGACCCGT 






52381 


GGATGGCCTG 


CCGGCTGCCC 






52441 


AGTCCGGTGG 


CGACGGCATC 




45 


52501 


ACGGTCGCGG 


CGGCTTCCTC 






52561 


GCCCGCGCGA 


GGCGCTGGCG 






52621 


AGGCGTTCGA 


GCACGCGGGC 






52681 


TCCTCGGCGC 


GTTCTTCCAG 






52741 


CGAGCATTCA 


CACGAGCGTG 




50 


52801 


CGGCGGTCAC 


GGTCGACACG 






52861 


AGTCGCTGCG 


CTCCGGCGAA 






52921 


CGCCGGCGGG 


GTTCGCGGAC 






52981 


AGGCCTTCGC 


GGAAGCGGCT 






53041 


TCGAGAAGCT 


CTCCGACGCC 
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GCGGTTCGAC CCGCCGGGAC TCGCCCGGTG GATGGACGAA 
CGCGCCGACG GCCGTACTGC GCGCGCTGAT CGAGCACGTC 
CGCCGCCCTG CGGCACCTGT GCCAGGGCGG CGAGGCGCTG 
CGAGCTGTGC CGGCACCGGC CCCACCTGCG CGTGCACAAT 
CCAGCTCATC ACCGGGTACA CGCTGCCCGC CGACCCCGAC 
GATCGGCCCG CCGATCGACA ACACCCGCAT CCATCTGCTC 
TCCGGACGGT ATGCCGGGGC AGCTCTGCGT CGCCGGCGTC 
GGCCCGTCCC GAGCTGACCG CCGAGCGCTG GGTGCCGGGA 
CATGTACCTC ACCGGCGACC TGGCCCGCCG CGCGCCCGAC 
CCGGATCGAC GACCAGGTCA AGATCCGCGG CATCCGCGTC 
CCTGCTCGCC GAGGACGCCC GCGTCACGCA GGCGGCGGTG 
GGGCGAGAAG TTCCTGGCCG CGTACGTCGT ACCGGTGGCC 
CGCCGCGTCG CTGCGCGCGG GACTGGCCGC CCGGCTGCCC 
CGTCGTCCTG GTGGAGCGAC TGCCGAGGAC CACGAGCGGC 
GCCCGACCCG GAGCCGGGCC CGGCGTCGAC CGGGGCGGTT 
GCGGACGGTG TGCCGGATCT TCCAGGAGGT GCTCGACGTC 
CGACTTCTTC ACGCTCGGCG GGCACTCCCT GCTCGCCACC 
CGCCGAGCTG GGTGCCGATG TCCCGCTGCG TACGCTCTTC 
GCTCGCCCGT GCGGCGGACG AGGCCGGCCC GGCCGCCCTG 
GGAGAACGGG CCGGCCCCCC TCACCGCGGC ACAGGAACAG 
GCTGCTCGCC GCGCCCTCCT ACACGGTCGC CCCGTACGGG 
CGACCGCGAA GCGCTCGACG CGGCACTGAC CCGGATCGCC 
GACCGGGTTC CGCGATCGGG AACAGGTCGT CCGGCCGCCC 
GGTTCCGGTG CCGGTCGGCG ACGTCGACGC CGCGGTCCGG 
CCGGCCGTTC GACCTCGTGA ACGGGTCGTT GCTGCGTGCC 
CGAGGATCAC GTGCTGCTGC TGATGCTGCA CCACCTCGCC 
CCTCCTGGTC CGGGAGTTGT CGGGGACGCA ACCGGACCTT 
GGCCCGGTGG GAACGGAGTC CGGCCGTGAT CGCGGCCAGG 
GCGCCGGCGG CTGGGGGGCG CCACCGCGCC GGAGCTGCCC 
ACCGACCGGG CGGGCGTTCC TGTGGACGCT CAAGGACACC 
GGTCGCGGAC GCCCACGACG CGACGTTGCA CGAAACCGTG 
CGTGGCGGAG ACCGCCGACA CCGACGACGT GCTCGTCGCG 
GTACGCCGGG ACCGACCACC TCATCGGCTT CTTCGCGAAG 
CCTCGGCGGC ACGCCGTCGT TCCCCGAGGT GCTGCGCCGG 
CGCGCACGCC CACCAGGCGG TGCCCTACTC CGCGCTGCGC 
GCCGGCCCCC GTGTCGTTCC AGCTCATCAG CGCGCTCAGC 
CATGCACACC GAGCCGTTCC CCGTCGTCGC CGAGACCGTC 
GTCGATCAAC CTCTTCGACG ACGGTCGCAC CGTCTCCGGC 
GCTGCTCGAC CGTGCCACCG TCGACGATTT GCTCACCCGG 
CGCCGCGGGC GACCTCACCG TACGCGTCAC CGGTTACGTG 
GAGCAGGACA AGACAGTCGA GTACCTTCGC TGGGCGACCG 
GCGGAACTCG CCGCGCACAG CGAGCCGTTG GCGATCGTGG 
GGCGGGGTCG CGTCGCCGGA GGACCTGTGG CAGTTGCTGG 
ACCGCGTTCC CCACGGACCG GGGCTGGGAG ACCACCGCCG 
ACCGGGGCGG CCGGCTTCGA CGCGGCGTTC TTCGGCATCA 
ATGGACCCGC AGCAGCGCCT GGCCCTGGAG ACCTCGTGGG 
ATCGATCCGC AGACGCTGCG GGGCAGTGAC ACGGGGGTGT 
GGGTACGGCA TCGGCGCCGA CTTCGACGGT TACGGCACCA 
CTCTCCGGCC GCCTCGCGTA CTTCTACGGT CTGGAGGGTC 
GCGTGTTCGT CGTCGCTGGT GGCGCTGCAC CAGGCCGGGC 
TGCTCGCTCG CCCTGGTCGG CGGCGTCACG GTGATGGCCT 
TTCTCCGAGC AGGGCGGCCT GGCCCCCGAC GCGCGCTGCA 
GACGGCACCG GTTTCGCCGA GGGGTCCGGC GTCCTGATCG 
GAGCGCAACG GCCACCGCGT GCTGGCGGTC GTCCGGGGTT 
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53101 CCGCCGTCAA CCAGGACGGT GCCTCCAACG 
53161 AGCGGGTGAT CCGGCAGGCC CTGGCCAACG 
53221 TCGAGGCCCA CGGCACCGGC ACCAGGCTGG 
53281 CCACCTACGG GCAGGGGCGC GACACCCCTG 
53341 GCCACACCCA GGCCGCCGCG GGCGTCGCCG 
534 01 ACGGCACCCT GCCCCGCACC CTGCACGTGG 
534 61 CCGGCGCCGT CGAACTCCTC ACCGACGCCC 
53521 GCGCCGGTGT CTCCTCCTTC GGCGTCAGCG 
53581 ACCCCCGACC GGCCCCCGAA CCCGCCCCGG 
53641 TCTCGGCCCG CACCCCGCAG GCACTCGACG 
53701 ACGACAACCC CGGCGCGGAC CGGGTCGCCG 
537 61 TCGAGCACCG CGCCGTGCTG CTCGGCGACA 
53821 GCGGACCGGT GGTCTTCGTC TACTCGGGGC 
53881 AACTCGCGTC CACCTACCCC GTGTTCGCCG 
GGCTGTCCGC GCCGAACGGG CCGTCGCAGG
CCGGACTCAC CCCGGCGGAC GTGGACGCCG
GCGACCCCAT CGAGGCACAG GCCGTGCTGG
TGCTGCTGGG CTCGCTGAAG TCCAACATCG
GTGTCATCAA GATGGTCCTC GCCATGCGGC
ACACGCCGTC CTCGCACGTC GACTGGACGG
GGCCCTGGCC CGAAACCGAC CGCCCACGGC
GCACCAACGC CCACATCATC CTCGAAAGCC
CACCCGACAC CGGACCGCTG CCGCTGCTGC
CACAGGTACA CCGCCTGCGC GCGTTCCTCG
TCGCGCAGAC ACTCGCCCGG CGCAGCCAGT
CGCTCATCAC CGTGAGCCCG AACGCCGGCC
AAAGCACGCT GCACCCGCAC ACCGGGCGGC
AAGCGTGGCG CGAGGCCCTC GACCACCTCG
53941 ACCCCACCCA GGGCCCGGCC ACGCACTTCG CCCACCAGAC CGCGCTCACC GCGCTCCTGC
54001 GGTCCTGGGG CATCACCCCG CACGCGGTCA TCGGCCACTC CCTCGGTGAG ATCACCGCCG
54061 CGCACGCCGC CGGTGTCCTG TCCCTGAGGG ACGCGGGCGC GCTCCTCACC ACCCGCACCC
54121 GCCTGATGGA CCAACTGCCG TCGGGCGGCG CGATGGTCAC CGTCCTGACC AGCGAGGAAA
54181 AGGCACGCCA GGTGCTGCGG CCGGGCGTGG AGATCGCCGC CGTCAACGGC CCCCACTCCC
54241 TCGTGCTGTC CGGGGACGAG GAAGCCGTAC TCGAAGCCGC CCGGCAGCTC GGCATCCACC
54301 ACCGCCTGCC GACCCGCCAC GCCGGCCACT CCGAGCGCAT GCAGCCACTC GTCGCCCCCC
54361 TCCTCGACGT CGCCCGGACC CTGACGTACC ACCAGCCCCA CACCGCCATC CCCGGCGACC
54421 CCACCACCGC CGAATACTGG GCGCACCAGG TCCGCGACCA AGTACGTTTC CAGGCGCACA
54481 CCGAGCAGTA CCCGGGCGCG ACGTTCCTCG AGATCGGCCC CAACCAGGAC CTCTCGCCGC
54541 TCGTCGACGG CGTTGCCGCC CAGACCGGTA CGCCCGACGA GGTGCGGGCG CTGCACACCG
54601 CGCTCGCGCA GCTCCACGTC CGCGGCGTCG CGATCGACTG GACGCTCGTC CTCGGCGGGG
54661 ACCGCGCGCC CGTCACGCTG CCCACGTATC CGTTCCAGCA CAAGGACTAC TGGCTGCGGC
54721 CCACCTCCCG GGCCGATGTG ACCGGCGCGG GGCAGGAGCA GGTGGCGCAC CCGCTGCTCG
54781 GCGCCGCGGT CGCGCTGCCC GGCACGGGCG GAGTCGTCCT GACCGGCCGC CTGTCGCTGG
54841 CCTCCCATCC GTGGCTCGGC GAGCACGCGG TCGACGGCAC CGTGCTCCTG CCCGGCGCGG
54901 CCTTCCTCGA ACTCGCGGCG CGCGCCGGCG ACGAGGTCGG CTGCGACCTG CTGCACGAAC
54961 TCGTCATCGA GACGCCGCTC GTGCTGCCCG CGACCGGCGG TGTGGCGGTC TCCGTCGAGA
55021 TCGCCGAACC CGACGACACG GGGCGGCGGG CGGTCACCGT CCACGCGCGG GCCGACGGCT
55081 CGGGCCTGTG GACCCGACAC GCCGGCGGAT TCCTCGGCAC GGCACCGGCA CCGGCCACGG
55141 CCACGGACCC GGCACCCTGG CCGCCCGCGG AAGCCGGACC GGTCGACGTC GCCGACGTCT
55201 ACGACCGGTT CGAGGACATC GGGTACTCCT ACGGACCGGG CTTCCGGGGG CTGCGGGCCG
55261 CCTGGCGCGC CGGCGACACC GTGTACGCCG AGGTCGCGCT CCCCGACGAG CAGAGCGCCG
55321 ACGCCGCCCG TTTCACGCTG CACCCCGCGC TGCTCGACGC CGCGTTCCAG GCCGGCGCGC
55381 TGGCCGCGCT CGACGCACCC GGCGGGGCGG CCCGACTGCC GTTCTCGTTC CAGGACGTCC
55441 GCATCCACGC GGCCGGGGCG ACGCGGCTGC GGGTCACGGT CGGCCGCGAC GGCGAGCGCA
55501 GCACCGTCCG CATGACCGGC CCGGACGGGC AGCTGGTGGC CGTGGTCGGT GCCGTGCTGT
55561 CGCGCCCGTA CGCGGAAGGC TCCGGTGACG GCCTGCTGCG CCCGGTCTGG ACCGAGCTGC
55621 CGATGCCCGT CCCGTCCGCG GACGATCCGC GCGTGGAGGT CCTCGGCGCC GACCCGGGCG
55681 ACGGCGACGT TCCGGCGGCC ACCCGGGAGC TGACCGCCCG CGTCCTCGGC GCGCTCCAGC
55741 GCCACCTGTC CGCCGCCGAG GACACCACCT TGGTGGTACG GACCGGCACC GGCCCGGCCG
55801 CTGCCGCCGC CGCGGGTCTG GTCCGCTCGG CGCAGGCGGA GAACCCCGGC CGCGTCGTGC
55861 TCGTCGAGGC GTCCCCGGAC ACCTCGGTGG AGCTGCTCGC CGCGTGCGCC GCGCTGGACG
55921 AACCGCAGCT GGCCGTCCGG GACGGCGTGC TCTTCGCGCC GCGGCTGGTC CGGATGTCCG
55981 ACCCCGCGCA CGGCCCGCTG TCCCTGCCGG ACGGCGACTG GCTGCTCACC CGGTCCGCCT
56041 CCGGCACGTT GCACGACGTC GCGCTCATAG CCGACGACAC GCCCCGGCGG GCGCTCGAAG
56101 CCGGCGAGGT CCGCATCGAC GTCCGCGCGG CCGGACTGAA CTTCCGCGAT GTGCTGATCG
56161 CGCTCGGGAC GTACACCGGG GCCACGGCCA TGGGCGGCGA GGCCGCGGGC GTCGTGGTGG
56221 AGACCGGGCC CGGCGTGGAC GACCTGTCCC CCGGCGACCG GGTGTTCGGC CTGACCCGGG
56281 GCGGCATCGG CCCGACGGCC GTCACCGACC GGCGCTGGCT GGCCCGGATC CCCGACGGCT
56341 GGAGCTTCAC CACGGCGGCG TCCGTCCCGA TCGTGTTCGC GACCGCGTGG TACGGCCTGG
56401 TCGACCTCGG CACACTGCGC GCCGGCGAGA AGGTCCTCGT CCACGCGGCC ACCGGCGGTG
56461 TCGGCATGGC CGCCGCACAG ATCGCCCGCC ACCTGGGCGC CGAGCTCTAC GCCACCGCCA
56521 GTACCGGCAA GCAGCACGTC CTGCGCGCCG CCGGGCTGCC CGACACGCAC ATCGCCGACT
56581 CTCGGACGAC CGCGTTCCGG ACCGCTTTCC CGCGCATGGA CGTCGTCCTG AACGCGCTGA
56641 CCGGCGAGTT CATCGACGCG TCGCTCGACC TGCTGGACGC CGACGGCCGG TTCGTCGAGA
56701 TGGGCCGCAC CGAGCTGCGC GACCCGGCCG CGATCGTCCC CGCCTACCTG CCGTTCGACC
56761 TGCTGGACGC GGGCGCCGAC CGCATCGGCG AGATCCTGGG CGAACTGCTC CGGCTGTTCG
56821 ACGCGGGCGC GCTGGAGCCG CTGCCGGTCC GTGCCTGGGA CGTCCGGCAG GCACGCGACG
56881 CGCTCGGCTG GATGAGCCGC GCCCGCCACA TCGGCAAGAA CGTCCTGACG CTGCCCCGGC
56941 CGCTCGACCC GGAGGGCGCC GTCGTCCTCA CCGGCGGCTC CGGCACGCTC GCCGGCATCC
57001 TCGCCCGCCA CCTGCGCGAA CGGCATGTCT ACCTGCTGTC CCGGACGGCA CCGCCCGAGG
57061 GGACGCCCGG CGTCCACCTG CCCTGCGACG TCGGTGACCG GGACCAGCTG GCGGCGGCCC
57121 TGGAGCGGGT GGACCGGCCG ATCACCGCCG TGGTGCACCT CGCCGGTGCG CTGGACGACG
57181 GCACCGTCGC GTCGCTCACC CCCGAGCGTT TCGACACGGT GCTGCGCCCG AAGGCCGACG
57241 GCGCCTGGTA CCTGCACGAG CTGACGAAGG AGCAGGACCT CGCCGCGTTC GTGCTCTACT
57301 CGTCGGCCGC CGGCGTGCTC GGCAACGCCG GCCAGGGCAA CTACGTCGCC GCGAACGCGT
57361 TCCTCGACGC GCTCGCCGAG CTGCGCCACG GTTCCGGGCT GCCGGCCCTC TCCATCGCCT
57421 GGGGGCTCTG GGAGGACGTG AGCGGGCTCA CCGCGGCGCT CGGCGAAGCC GACCGGGACC
57481 GGATGCGGCG CAGCGGTTTC CGGGCCATCA CCGCGCAACA GGGCATGCAC CTGTACGAGG
57541 CGGCCGGCCG CACCGGAAGT CCCGTGGTGG TCGCGGCGGC GCTCGACGAC GCGCCGGACG
57601 TGCCGCTGCT GCGCGGCCTG CGGCGGACGA CCGTCCGGCG GGCCGCCGTC CGGGAGTGTT
57661 CGTCCGCCGA CCGGCTCGCC GCGCTGACCG GCGACGAGCT CGCCGAAGCG CTGCTGACGC
57721 TCGTCCGGGA GAGCACCGCC GCCGTGCTCG GCCACGTGGG TGGCGAGGAC ATCCCCGCGA
57781 CGGCGGCGTT CAAGGACCTC GGCATCGACT CGCTCACCGC GGTCCAGCTG CGCAACGCCC
57841 TCACCGAGGC GACCGGTGTG CGGCTGAACG CCACGGCGGT CTTCGACTTC CCGACCCCGC
57901 ACGTGCTCGC CGGGAAGCTC GGCGACGAAC TGACCGGCAC CCGCGCGCCC GTCGTGCCCC
57961 GGACCGCGGC CACGGCCGGT GCGCACGACG AGCCGCTGGC GATCGTGGGA ATGGCCTGCC
58021 GGCTGCCCGG CGGGGTCGCG TCACCCGAGG AGCTGTGGCA CCTCGTGGCA TCCGGCACCG
58081 ACGCCATCAC GGAGTTCCCG ACGGACCGCG GCTGGGACGT CGACGCGATC TACGACCCGG
58141 ACCCCGACGC GATCGGCAAG ACCTTCGTCC GGCACGGTGG CTTCCTCACC GGCGCGACAG
58201 GCTTCGACGC GGCGTTCTTC GGCATCAGCC CGCGCGAGGC CCTCGCGATG GACCCGCAGC
58261 AGCGGGTGCT CCTGGAGACG TCGTGGGAGG CGTTCGAAAG CGCCGGCATC ACCCCGGACT
58321 CGACCCGCGG CAGCGACACC GGCGTGTTCG TCGGCGCCTT CTCCTACGGT TACGGCACCG
58381 GTGCGGACAC CGACGGCTTC GGCGCGACCG GCTCGCAGAC CAGTGTGCTC TCCGGCCGGC
58441 TGTCGTACTT CTACGGTCTG GAGGGTCCGG CGGTCACGGT CGACACGGCG TGTTCGTCGT
58501 CGCTGGTGGC GCTGCACCAG GCCGGGCAGT CGCTGCGCTC CGGCGAATGC TCGCTCGCCC
58561 TGGTCGGCGG CGTCACGGTG ATGGCGTCTC CCGGCGGCTT CGTGGAGTTC TCCCGGCAGC
58621 GCGGCCTCGC GCCGGACGGC CGGGCGAAGG CGTTCGGCGC GGGTGCGGAC GGCACGAGCT
58681 TCGCCGAGGG TGCCGGTGTG CTGATCGTCG AGAGGCTCTC CGACGCCGAA CGCAACGGTC
58741 ACACCGTCCT GGCGGTCGTC CGTGGTTCGG CGGTCAACCA GGATGGTGCC TCCAACGGGC
58801 TGTCGGCGCC GAACGGGCCG TCGCAGGAGC GGGTGATCCG GCAGGCCCTG GCCAACGCCG
58861 GGCTCACCCC GGCGGACGTG GACGCCGTCG AGGCCCACGG CACCGGCACC AGGCTGGGCG
58921 ACCCCATCGA GGCACAGGCG GTACTGGCCA CCTACGGACA GGAGCGCGCC ACCCCCCTGC
58981 TGCTGGGCTC GCTGAAGTCC AACATCGGCC ACGCCCAGGC CGCGTCCGGC GTCGCCGGCA
59041 TCATCAAGAT GGTGCAGGCC CTCCGGCACG GGGAGCTGCC GCCGACGCTG CACGCCGACG
59101 AGCCGTCGCC GCACGTCGAC TGGACGGCCG GCGCCGTCGA ACTGCTGACG TCGGCCCGGC
59161 CGTGGCCCGA GACCGACCGG CCACGGCGTG CCGCCGTCTC CTCGTTCGGG GTGAGCGGCA
59221 CCAACGCCCA CGTCATCCTG GAGGCCGGAC CGGTAACGGA GACGCCCGCG GCATCGCCTT
59281 CCGGTGACCT TCCCCTGCTG GTGTCGGCAC GCTCACCGGA AGCGCTCGAC GAGCAGATCC
59341 GCCGACTGCG CGCCTACCTG GACACCACCC CGGACGTCGA CCGGGTGGCC GTGGCACAGA
59401 CGCTGGCCCG GCGCACACAC TTCGCCCACC GCGCCGTGCT GCTCGGTGAC ACCGTCATCA
59461 CCACACCCCC CGCGGACCGG CCCGACGAAC TCGTCTTCGT CTACTCCGGC CAGGGCACCC
59521 AGCATCCCGC GATGGGCGAG CAGCTCGCCG CCGCCCATCC CGTGTTCGCC GACGCCTGGC
59581 ATGAAGCGCT CCGCCGCCTT GACAACCCCG ACCCCCACGA CCCCACGCAC AGCCAGCATG
59641 TGCTCTTCGC CCACCAGGCG GCGTTCACCG CCCTCCTGCG GTCCTGGGGC ATCACCCCGC
59701 ACGGGGTCAT CGGCCACTCG CTGGGCGAGA TCACCGCGGC GCACGCCGCC GGCATCCTGT
59761 CGCTGGACGA CGCGTGCACC CTGATCACCA CGCGCGCCCG CCTCATGCAC ACGCTCCCGC
59821 CACCCGGTGC CATGGTCACC GTACTGACCA GCGAAGAGAA GGCACGCCAG GCGTTGCGGC
59881 CGGGCGTGGA GATCGCCGCC GTCAACGGGC CCCACTCCAT CGTGCTGTCC GGGGACGAGG
59941 ACGCCGTGCT CACCGTCGCC GGGCAGCTCG GCATCCACCA CCGCCTGCCC GCCCCGCACG
60001 CCGGGCACTC CGCGCACATG GAGCCCGTGG CCGCCGAGCT GCTCGCCACC ACCCGCGGGC
60061 TCCGCTACCA CCCTCCCCAC ACCTCCATTC CGAACGACCC CACCACCGCT GAGTACTGGG
60121 CCGAGCAGGT CCGCAAGCCC GTGCTGTTCC ACGCCCACGC GCAGCAGTAC CCGGACGCCG
60181 TGTTCGTGGA GATCGGCCCC GCCCAGGACC TCTCCCCGCT CGTCGACGGG ATCCCGCTGC
60241 AGAACGGCAC CGCGGACGAG GTGCACGCGC TGCACACCGC GCTCGCGCAC CTCTACGCGC
60301 GCGGTGCCAC GCTCGACTGG CCCCGCATCC TCGGGGCTGG GTCACGGCAC GACGCGGATG
60361 TGCCCGCGTA CGCGTTCCAA CGGCGGCACT ACTGGATCGA GTCGGCACGC CCGGCCGCAT
60421 CCGACGCGGG CCACCGCGTG CTGGGCTCCG GTATCGCCCT CGCCGGGTCG CCGGGCCGGG
60481 TGTTCACGGG TTCCGTGCCG ACCGGTGCGG ACCGCGCGGT GTTCGTCGCC GAGCTGGCGC
60541 TGGCCGCCGC GGACGCGGTC GACTGCGCCA CGGTCGAGCG GCTCGACATC GCCTCCGTGC
60601 CCGGCCGGCC GGGCCATGGC CGGACGACCG TACAGACCTG GGTCGACGAG CCGGCGGACG
60661 ACGGCCGGCG CCGGTTCACC GTGCACACGC GCACCGGCGA CGCCCCGTGG ACGCTGCACG
60721 CCGAGGGGGT GCTGCGCCCC CATGGCACGG CCCTGCCCGA TGCGGCCGAC GCCGAGTGGC
60781 CCCCACCGGG CGCGGTGCCC GCGGACGGGC TGCCGGGTGT GTGGCGCCGG GGGGACCAGG
60841 TCTTCGCCGA GGCCGAGGTG GACGGACCGG ACGGTTTCGT GGTGCACCCC GACCTGCTCG
60901 ACGCGGTCTT CTCCGCGGTC GGCGACGGAA GCCGCCAGCC GGCCGGATGG CGCGACCTGA
60961 CGGTGCACGC GTCGGACGCC ACCGTACTGC GCGCCTGCCT CACCCGGCGC ACCGACGGAG
61021 CCATGGGATT CGCCGCCTTC GACGGCGCCG GCCTGCCGGT ACTCACCGCG GAGGCGGTGA
61081 CGCTGCGGGA GGTGGCGTCA CCGTCCGGCT CCGAGGAGTC GGACGGCCTG CACCGGTTGG
61141 AGTGGCTCGC GGTCGCCGAG GCGGTCTACG ACGGTGACCT GCCCGAGGGA CATGTCCTGA
61201 TCACCGCCGC CCACCCCGAC GACCCCGAGG ACATACCCAC CCGCGCCCAC ACCCGCGCCA
61261 CCCGCGTCCT GACCGCCCTG CAACACCACC TCACCACCAC CGACCACACC CTCATCGTCC
61321 ACACCACCAC CGACCCCGCC GGCGCCACCG TCACCGGCCT CACCCGCACC GCCCAGAACG
61381 AACACCCCCA CCGCATCCGC CTCATCGAAA CCGACCACCC CCACACCCCC CTCCCCCTGG
61441 CCCAACTCGC CACCCTCGAC CACCCCCACC TCCGCCTCAC CCACCACACC CTCCACCACC
61501 CCCACCTCAC CCCCCTCCAC ACCACCACCC CACCCACCAC CACCCCCCTC AACCCCGAAC
61561 ACGCCATCAT CATCACCGGC GGCTCCGGCA CCCTCGCCGG CATCCTCGCC CGCCACCTGA
61621 ACCACCCCCA CACCTACCTC CTCTCCCGCA CCCCACCCCC CGACGCCACC CCCGGCACCC
61681 ACCTCCCCTG CGACGTCGGC GACCCCCACC AACTCGCCAC CACCCTCACC CACATCCCCC
61741 AACCCCTCAC CGCCATCTTC CACACCGCCG CCACCCTCGA CGACGGCATC CTCCACGCCC
61801 TCACCCCCGA CCGCCTCACC ACCGTCCTCC ACCCCAAAGC CAACGCCGCC TGGCACCTGC
61861 ACCACCTCAC CCAAAACCAA CCCCTCACCC ACTTCGTCCT CTACTCCAGC GCCGCCGCCG
61921 TCCTCGGCAG CCCCGGACAA GGAAACTACG CCGCCGCCAA CGCCTTCCTC GACGCCCTCG
61981 CCACCCACCG CCACACCCTC GGCCAACCCG CCACCTCCAT CGCCTGGGGC ATGTGGCACA
62041 CCACCAGCAC CCTCACCGGA CAACTCGACG ACGCCGACCG GGACCGCATC CGCCGCGGCG
62101 GTTTCCTCCC GATCACGGAC GACGAGGGCA TGCGCCTCTA CGAGGCGGCC GTCGGCTCCG
62161 GCGAGGACTT CGTCATGGCC GCCGCGATGG ACCCGGCACA GCCGATGACC GGCTCCGTAC
62221 CGCCCATCCT GAGCGGCCTG CGCAGGAGCG CGCGGCGCGT CGCCCGTGCC GGGCAGACGT
62281 TCGCCCAGCG GCTCGCCGAG CTGCCCGACG CCGACCGCGG CGCGGCGCTG ACCACCCTCG
62341 TCTCGGACGC CACGGCCGCC GTGCTCGGCC ACGCCGACGC CTCCGAGATC GCGCCGACCA
62401 CGACGTTCAA GGACCTCGGC ATCGACTCGC TCACCGCGAT CGAGCTGCGC AACCGGCTCG
62461 CGGAGGCGAC CGGGCTGCGG CTGAGTGCCA CGCTGGTGTT CGACCACCCG ACACCTCGGG
62521 TCCTCGCCGC CAAGCTCCGC ACCGATCTGT TCGGCACGGC CGTGCCCACG CCCGCGCGGA
62581 CGGCACGGAC CCACCACGAC GAGCCACTCG CGATCGTCGG CATGGCGTGC CGACTGCCCG
62641 GCGGGGTCGC CTCGCCGGAG GACCTGTGGC AGCTCGTGGC GTCCGGCACC GACGCGATCA
62701 CCGAGTTCCC CACCGACCGC GGCTGGGACA TCGACCGGCT GTTCGACCCG GACCCGGACG
62761 CCCCCGGCAA GACCTACGTC CGGCACGGCG GCTTCCTCGC CGAGGCCGCC GGCTTCGATG
62821 CCGCGTTCTT CGGCATCAGC CCGCGCGAGG CACGGGCCAT GGACCCGCAG CAGCGCGTCA
62881 TCCTCGAAAC CTCCTGGGAG GCGTTCGAGA ACGCGGGCAT CGTGCCGGAC ACGCTGCGCG
62941 GCAGCGACAC CGGCGTGTTC ATGGGCGCGT TCTCCCATGG GTACGGCGCC GGCGTCGACC
63001 TGGGCGGGTT CGGCGCCACC GCCACGCAGA ACAGCGTGCT CTCCGGCCGG TTGTCGTACT
63061 TCTTCGGCAT GGAGGGCCCG GCCGTCACCG TCGACACCGC CTGCTCGTCG TCGCTGGTCG
63121 CCCTGCACCA GGCGGCACAG GCGCTGCGGA CTGGAGAATG CTCGCTGGCG CTCGCCGGCG
63181 GTGTCACGGT GATGCCCACC CCGCTGGGCT ACGTCGAGTT CTGCCGCCAG CGGGGACTCG
63241 CCCCCGACGG CCGTTGCCAG GCCTTCGCGG AAGGCGCCGA CGGCACGAGC TTCTCGGAGG
63301 GCGCCGGCGT TCTTGTGCTG GAGCGGCTCT CCGACGCCGA GCGCAACGGA CACACCGTCC
63361 TCGCGGTCGT CCGCTCCTCC GCCGTCAACC AGGACGGCGC CTCCAACGGC ATCTCCGCAC
63421 CCAACGGCCC CTCCCAGCAG CGCGTCATCC GCCAGGCCCT CGACAAGGCC GGGCTCGCCC
63481 CCGCCGACGT GGACGTGGTG GAGGCCCACG GCACCGGAAC CCCGCTGGGC GACCCGATCG
63541 AGGCACAGGC CATCATCGCG ACCTACGGCC AGGACCGCGA CACACCGCTC TACCTCGGTT
63601 CGGTCAAGTC GAACATCGGA CACACCCAGA CCACCGCCGG TGTCGCCGGC GTCATCAAGA
63661 TGGTCATGGC GATGCGCCAC GGCATCGCGC CGAAGACACT GCACGTGGAC GAGCCGTCGT
63721 CGCATGTGGA CTGGACCGAG GGTGCGGTGG AACTGCTCAC CGAGGCGAGG CCGTGGCCCG
63781 ACGCGGGACG CCCGCGCCGC GCGGGCGTGT CGTCGCTCGG TATCAGCGGT ACGAACGCCC
63841 ACGTGATCCT TGAGGGTGTT CCCGGGCCGT CGCGTGTGGA GCCGTCTGTT GACGGGTTGG
63901 TGCCGTTGCC GGTGTCGGCT CGGAGTGAGG CGAGTCTGCG GGGGCAGGTG GAGCGGCTGG
63961 AGGGGTATCT GCGCGGGAGT GTGGATGTGG CCGCGGTCGC GCAGGGGTTG GTGCGTGAGC
64021 GTGCTGTCTT CGGTCACCGT GCGGTACTGC TGGGTGATGC CCGGGTGATG GGTGTGGCGG
64081 TGGATCAGCC GCGTACGGTG TTCGTCTTTC CCGGGCAGGG TGCTCAGTGG GTGGGCATGG
64141 GTGTGGAGTT GATGGACCGT TCTGCGGTGT TCGCGGCTCG TATGGAGGAG TGTGCGCGGG
64201 CGTTGTTGCC GCACACGGGC TGGGATGTGC GGGAGATGTT GGCGCGGCCG GATGTGGCGG
64261 AGCGGGTGGA GGTGGTCCAG CCGGCCAGCT GGGCGGTCGC GGTCAGCCTG GCCGCACTGT
64321 GGCAGGCCCA CGGGGTCGTA CCCGACGCGG TGATCGGACA CTCCCAGGGC GAGATCGCGG
64381 CGGCGTGCGT GGCCGGGGCC CTCAGCCTTG AGGACGCCGC CCGCGTGGTG GCCTTGCGCA
64441 GCCAGGTCAT CGCGGCGCGA CTGGCCGGGC GGGGAGCGAT GGCTTCGGTG GCATTGCCGG
64501 CCGGTGAGGT CGGTCTGGTC GAGGGCGTGT GGATCGCGGC GCGTAACGGC CCCGCCTCGA
64561 CAGTCGTGGC CGGCGAGCCG TCGGCGGTGG AGGACGTGGT GACGCGGTAT GAGACCGAAG
64621 GCGTGCGAGT GCGTCGTATC GCCGTCGACT ACGCCTCCCA CACGCCCCAC GTGGAAGCCA
64681 TCGAGGACGA ACTCGCTGAG GTACTGAAGG GAGTTGCAGG GAAGGCCGCG TCGGTGGCGT
64741 GGTGGTCGAC CGTGGACAGC GCCTGGGTGA CCGAGCCGGT GGATGAGAGT TACTGGTACC
64801 GGAACCTGCG TCGCCCCGTC GCGCTGGACG CGGCGGTGGC GGAGCTGGAC GGGTCCGTGT
64861 TCGTGGAGTG CAGCGCCCAT CCGGTGCTGC TGCCGGCGAT GGAACAGGCC CACACGGTGG
64921 CGTCGTTGCG CACCGGTGAC GGCGGCTGGG AGCGATGGCT GACGGCGTTG GCGCAGGCGT
64981 GGACCCTGGG CGCGGCAGTG GACTGGGACA CGGTGGTCGA ACCGGTGCCA GGGCGGCTGC
65041 TCGATCTGCC CACCTACGCG TTCGAGCGCC GGCGCTACTG GCTGGAAGCG GCCGGTGCCA
65101 CCGACCTGTC CGCGGCCGGG CTGACAGGGG CAGCACATCC CATGCTGGCC GCCATCACGG
65161 CACTACCCGC CGACGACGGT GGTGTTGTTC TCACCGGCCG GATCTCGTTG CGCACGCATC
65221 CCTGGCTGGC TGATCACGCG GTGCGGGGCA CGGTCCTGCT GCCGGGCACG GCCTTTGTGG
65281 AGCTGGTCAT CCGGGCCGGT GACGAGACCG GTTGCGGGAT AGTGGATGAA CTGGTCATCG
65341 AATCCCCCCT CGTGGTGCCG GCGACCGCAG CCGTGGATCT GTCGGTGACC GTGGAAGGAG
65401 CTGACGAGGC CGGACGGCGG CGAGTGACCG TCCACGCCCG CACCGAAGGC ACCGGCAGCT
65461 GGACCCGGCA CGCCAGCGGC ACCCTGACCC CCGACACCCC CGACACCCCC AACGCTTCCG
65521 GTGTTGTCGG TGCGGAGCCG TTCTCGCAGT GGCCACCTGC CACTGCCGCG GCCGTCGACA
65581 CCTCGGAGTT CTACTTGCGC CTGGACGCGC TGGGCTACCG GTTCGGACCC ATGTTCCGCG
65641 GAATGCGGGC TGCCTGGCGT GATGGTGACA CCGTGTACGC CGAGGTCGCG CTCCCCGAGG
65701 ACCGTGCCGC CGACGCGGAC GGTTTCGGCA TGCACCCGGC GCTGCTCGAC GCGGCCTTGC
65761 AGAGCGGCAG CCTGCTCATG CTGGAATCGG ACGGCGAGCA GAGCGTGCAA CTGCCGTTCT
65821 CCTGGCACGG CGTCCGGTTC CACGCGACGG GCGCGACCAT GCTGCGGGTG GCGGTCGTAC
65881 CGGGCCCGGA CGGCCTCCGG CTGCATGCCG CGGACAGCGG GAACCGTCCC GTCGCGACGA
65941 TCGACGCGCT CGTGACCCGG TCCCCGGAAG CGGACCTCGC GCCCGCCGAT CCGATGCTGC
66001 GGGTCGGGTG GGCCCCGGTG CCCGTACCTG CCGGGGCCGG TCCGTCCGAC GCGGACGTGC
66061 TGACGCTGCG CGGCGACGAC GCCGACCCGC TCGGGGAGAC CCGGGACCTG ACCACCCGTG
66121 TTCTCGACGC GCTGCTCCGG GCCGACCGGC GGGTGATCTT CCAGGTGACC GGTGGCCTCG
66181 CCGCCAAGGC GGCCGCAGGC CTGGTCCGCA CCGCTCAGAA CGAGCAGCCC GGCCGCTTCT
66241 TCCTCGTCGA AACGGACCCG GGAGAGGTCC TGGACGGCGC GAAGCGCGAC GCGATCGCGG
66301 CACTCGGCGA GCCCCATGTG CGGCTGCGCG ACGGCCTCTT CGAGGCAGCC CGGCTGATGC
66361 GGGCCACGCC GTCCCTGACG CTCCCGGACA CCGGGTCGTG GCAGCTGCGG CCGTCCGCCA
66421 CCGGTTCCCT CGACGACCTT GCCGTCGTCC CCACCGACGC CCCGGACCGG CCGCTCGCGG
66481 CCGGCGAGGT GCGGATCGCG GTACGCGCGG CGGGCCTGAA CTTCCGGGAT GTCACGGTCG
66541 CGCTCGGTGT GGTCGCCGAT GCGCGTCCGC TCGGCAGCGA GGCCGCGGGT GTCGTCCTGG
66601 AGACCGGCCC CGGTGTGCAC GACCTGGCGC CCGGCGACCG GGTCCTGGGG ATGCTCGCGG
66661 GCGCCTTCGG ACCGGTCGCG ATCACCGACC GGCGGCTGCT CGGCCGGATG CCGGACGGCT
66721 GGACGTTCCC GCAGGCGGCG TCCGTGATGA CCGCGTTCGC GACCGCGTGG TACGGCCTGG
66781 TCGACCTGGC CGGGCTGCGC CCCGGCGAGA AGGTCCTGAT CCACGCGGCG GCGACCGGTG
66841 TCGGCGCGGC GGCCGTCCAG ATCGCGCGGC ATCTGGGCGC GGAGGTGTAC GCGACCACCA
66901 GCGCCGCGAA GCGCCATCTG GTGGACCTGG ACGGAGCGCA TCTGGCCGAT TCCCGCAGCA
66961 CCGCGTTCGC CGACGCGTTC CCGCCGGTCG ATGTCGTGCT CAACTCGCTC ACCGGTGAAT
67021 TCCTCGACGC GTCCGTCGGC CTGCTCGCGG CGGGTGGCCG GTTCATCGAG ATGGGGAAGA
67081 CGGACATCCG GCACGCCGTC CAGCAGCCGT TCGACCTGAT GGACGCCGGC CCCGACCGGA
67141 TGCAGCGGAT CATCGTCGAG CTGCTCGGCC TGTTCGCGCG CGACGTGCTG CACCCGCTGC
67201 CGGTCCACGC CTGGGACGTG CGGCAGGCGC GGGAGGCGTT CGGCTGGATG AGCAGCGGGC
67261 GTCACACCGG CAAGCTGGTG CTGACGGTCC CGCGGCCGCT GGATCCCGAG GGGGCCGTCG
67321 TCATCACCGG CGGCTCCGGC ACCCTCGCCG GCATCCTCGC CCGCCACCTG GGCCACCCCC
67381 ACACCTACCT GCTCTCCCGC ACCCCACCCC CCGACACCAC CCCCGGCACC CACCTCCCCT
67441 GCGACGTCGG CGACCCCCAC CAACTCGCCA CCACCCTCGC CCGCATCCCC CAACCCCTCA
67501 CCGCCGTCTT CCACACCGCC GGAACCCTCG ACGACGCCCT GCTCGACAAC CTCACCCCCG
67561 ACCGCGTCGA CACCGTCCTC AAACCCAAGG CCGACGCCGC CTGGCACCTG CACCGGCTCA
67621 CCCGCGACAC CGACCTCGCC GCGTTCGTCG TCTACTCCGC GGTCGCCGGC CTCATGGGCA
67681 GCCCGGGGCA GGGCAACTAC GTCGCGGCGA ACGCGTTCCT CGACGCGCTC GCCGAACACC
67741 GCCGTGCGCA AGGGCTGCCC GCGCAGTCCC TCGCATGGGG CATGTGGGCG GACGTCAGCG
67801 CGCTCACCGC GAAACTCACC GACGCGGACC GCCAGCGCAT CCGGCGCAGC GGATTCCCGC
67861 CGTTGAGCGC CGCGGACGGC ATGCGGCTGT TCGACGCGGC GACGCGTACC CCGGAACCGG
67921 TCGTCGTCGC GACGACCGTC GACCTCACCC AGCTCGACGG CGCCGTCGCG CCGTTGCTCC
67981 GCGGTCTGGC CGCGCACCGG GCCGGGCCGG CGCGCACGGT CGCCCGCAAC GCCGGCGAAG
68041 AGCCCCTGGC CGTGCGTCTT GCCGGGCGTA CCGCCGCCGA GCAGCGGCGC ATCATGCAGG
68101 AGGTCGTGCT CCGCCACGCG GCCGCGGTCC TCGCGTACGG GCTGGGCGAC CGCGTGGCGG
68161 CGGACCGTCC GTTCCGCGAG CTCGGTTTCG ATTCGCTGAC CGCGGTCGAC CTGCGCAATC
68221 GGCTCGCGGC CGAGACGGGG CTGCGGCTGC CGACGACGCT GGTGTTCAGC CACCCGACGG
68281 CGGAGGCGCT CACCGCCCAC CTGCTCGACC TGATCGACGC TCCCACCGCC CGGATCGCCG
68341 GGGAGTCCCT GCCCGCGGTG ACGGCCGCTC CCGTGGCGGC CGCGCGGGAC CAGGACGAGC
68401 CGATCGCCAT CGTGGCGATG GCGTGCCGGC TGCCCGGTGG TGTGACGTCG CCCGAGGACC
68461 TGTGGCGGCT CGTCGAGTCC GGCACCGACG CGATCACCAC GCCTCCTGAC GACCGCGGCT
68521 GGGACGTCGA CGCGCTGTAC GACGCGGACC CGGACGCGGC CGGCAAGGCG TACAACCTGC
68581 GGGGCGGTTA CCTGGCCGGG GCGGCGGAGT TCGACGCGGC GTTCTTCGAC ATCAGTCCGC
68641 GCGAAGCGCT CGGCATGGAC CCGCAGCAAC GCCTGCTGCT CGAAACGGCG TGGGAGGCGA
68701 TCGAGCGCGG CCGGATCAGT CCGGCGTCGC TCCGCGGCCG GGAGGTCGGC GTCTATGTCG
68761 GTGCGGCCGC GCAGGGCTAC GGGCTGGGCG CCGAGGACAC CGAGGGCCAC GCGATCACCG
68821 GTGGTTCCAC GAGCCTGCTG TCCGGACGGC TGGCGTACGT GCTCGGGCTG GAGGGCCCGG
68881 CGGTCACCGT GGACACGGCG TGCTCGTCGT CTCTGGTCGC GCTGCATCTG GCGTGCCAGG
68941 GGCTGCGCCT GGGCGAGTGC GAACTCGCTC TGGCCGGAGG GGTCTCCGTA CTGAGTTCGC
69001 CGGCCGCGTT CGTGGAGTTC TCCCGCCAGC GCGGGCTCGC GGCCGACGGG CGCTGCAAGT
69061 CGTTCGGCGC GGGCGCGGAC GGCACGACGT GGTCCGAGGG CGTGGGCGTG CTCGTACTGG
69121 AACGGCTCTC CGACGCCGAG CGGCTCGGGC ACACCGTGCT CGCCGTCGTC CGCGGCAGCG
69181 CCGTCACGTC CGACGGCGCC TCCAACGGCC TCACCGCGCC GAACGGGCTC TCGCAGCAGC
69241 GGGTCATCCG GAAGGCGCTC GCCGCGGCCG GGCTGACCGG CGCCGACGTG GACGTCGTCG
69301 AGGGGCACGG CACCGGCACC CGGCTCGGCG ACCCGGTCGA GGCGGACGCG CTGCTCGCGA
69361 CGTACGGGCA GGACCGTCCG GCACCGGTCT GGCTGGGCTC GCTGAAGTCG AACATCGGAC
69421 ATGCCACGGC CGCGGCCGGT GTCGCGGGCG TCATCAAGAT GGTGCAGGCG ATCGGCGCGG
69481 GCACGATGCC GCGGACGCTG CATGTGGAGG AGCCCTCGCC CGCCGTCGAC TGGAGCACCG
69541 GACAGGTGTC CCTGCTCGGC TCCAACCGGC CCTGGCCGGA CGACGAGCGT CCGCGCCGGG
69601 CGGCCGTCTC CGCGTTCGGG CTCAGCGGGA CGAACGCGCA CGTCATCCTG GAACAGCACC
69661 GTCCGGCGCC CGTGGCGTCC CAGCCGCCCC GGCCGCCCCG TGAGGAGTCC CAGCCGCTGC
69721 CGTGGGTGCT CTCCGCGCGG ACTCCGGCCG CGCTGCGGGC CCAGGCGGCC CGGCTGCGCG
69781 ACCACCTCGC GGCGGCACCG GACGCGGATC CGTTGGACAT CGGGTACGCG CTGGCCACCA
69841 GCCGCGCCCA GTTCGCCCAC CGTGCCGCGG TCGTCGCCAC CACCCCGGAC GGATTCCGTG
69901 CCGCGCTCGA CGGCCTCGCG GACGGCGCGG AGGCGCCCGG AGTCGTCACC GGGACCGCTC
69961 AGGAGCGGCG CGTCGCCTTC CTCTTCGACG GCCAGGGCGC CCAGCGCGCC GGAATGGGGC
70021 GCGAGCTCCA CCGCCGGTTC CCCGTCTTCG CCGCCGCGTG GGACGAGGTC TCCGACGCGT
70081 TCGGCAAGCA CCTCAAGCAC TCCCCCACGG ACGTCTACCA CGGCGAACAC GGCGCTCTCG
70141 CCCATGACAC CCTGTACGCC CAGGCCGGCC TGTTCACGCT CGAAGTGGCG CTGCTGCGGC
70201 TGCTGGAGCA CTGGGGGGTG CGGCCGGACG TGCTCGTCGG GCACTCCGTC GGCGAGGTGA
70261 CCGCGGCGTA CGCGGCGGGG GTGCTCACCC TGGCGGACGC GACGGAGTTG ATCGTGGCCC
70321 GGGGGCGGGC GCTGCGGGCG CTGCCGCCCG GGGCGATGCT CGCCGTCGAC GGAAGCCCGG
70381 CGGAGGTCGG CGCCCGCACG GATCTGGACA TCGCCGCGGT CAACGGCCCG TCCGCCGTGG
70441 TGCTCGCCGG TTCGCCGGAC GATGTGGCGG CGTTCGAACG GGAGTGGTCG GCGGCCGGGC
70501 GGCGCACGAA ACGGCTCGAC GTCGGGCACG CGTTCCACTC CCGGCACGTC GACGGTGCGC
70561 TCGACGGCTT CCGTACGGTG CTGGAGTCGC TCGCGTTCGG CGCGGCGCGG CTGCCGGTGG
70621 TGTCCACGAC GACGGGCCGG GACGCCGCGG ACGACCTCAT AACGCCCGCG CACTGGCTGC
70681 GCCATGCGCG TCGGCCGGTG CTGTTCTCGG ATGCCGTCCG GGAGCTGGCC GACCGCGGCG
70741 TCACCACGTT CGTGGCCGTC GGCCCCTCCG GCTCCCTGGC GTCGGCCGCG GCGGAGAGCG
70801 CCGGGGAGGA CGCCGGGACC TACCACGCGG TGCTGCGCGC CCGGACCGGT GAGGAGACCG
70861 CGGCGCTGAC CGCCCTCGCC GAGCTGCACG CCCACGGCGT CCCGGTCGAC CTGGCCGCGG
70921 TACTGGCCGG TGGCCGGCCA GTGGACCTTC CCGTGTACGC GTTCCAGCAC CGTTCCTACT
70981 GGCTGGCCCC GGCCGTGGCG GGGGCGCCGG CCACCGTGGC GGACACCGGG GGTCCGGCGG
71041 AGTCCGAGCC GGAGGACCTC ACCGTCGCCG AGATCGTCCG TCGGCGCACC GCGGCGCTGC
71101 TCGGCGTCAC GGACCCCGCC GACGTCGATG CGGAAGCGAC GTTCTTCGCG CTCGGTTTCG
71161 ACTCACTGGC GGTGCAGCGG CTGCGCAACC AGCTCGCCTC GGCAACCGGG CTGGACCTGC
71221 CGGCGGCCGT CCTGTTCGAC CACGACACCC CGGCCGCGCT CACCGCGTTC CTCCAGGACC
71281 GGATCGAGGC CGGCCAGGAC CGGATCGAGG CCGGCGAGGA CGACGACGCG CCCACCGTGC
71341 TCTCGCTCCT GGAGGAGATG GAGTCGCTCG ACGCCGCGGA CATCGCGGCG ACGCCGGCCC
71401 CGGAGCGTGC GGCCATCGCC GATCTGCTCG ACAAGCTCGC CCATACCTGG AAGGACTACC
71461 GATGAGCACC GATACGCACG AGGGAACGCC GCCCGCCGGC CGCTGCCCAT TCGCGATCCA
71521 GGACGGTCAC CGCGCCATCC TGGAGAGCGG CACGGTGGGT TCGTTCGACC TGTTCGGCGT
71581 CAAGCACTGG CTGGTCGCCG CCGCCGAGGA CGTCAAGCTG GTCACCAACG ATCCGCGGTT
71641 CAGCTCGGCC GCGCCGTCCG AGATGCTGCC CGACCGGCGG CCCGGCTGGT TCTCCGGGAT
71701 GGACTCACCG GAGCACAACC GCTACCGGCA GAAGATCGCG GGGGACTTCA CACTGCGCGC
71761 GGCGCGCAAG CGGGAGGACT TCGTCGCCGA GGCCGCCGAC GCCTGCCTGG ACGACATCGA
71821 GGCCGCGGGA CCCGGCACCG ACCTCATCCC CGGGTACGCC AAGCGGCTGC CCTCCCTCGT
71881 CATCAACGCG CTGTACGGGC TCACCCCTGA GGAGGGGGCC GTGCTGGAGG CACGGATGCG
71941 CGACATCACC GGCTCGGCCG ATCTGGACAG CGTCAAGACG CTGACCGACG ACTTCTTCGG
72001 GCACGCGCTG CGGCTGGTCC GCGCGAAGCG TGACGAGCGG GGCGAGGACC TGCTGCACCG
72061 GCTGGCCTCG GCCGACGACG GCGAGATCTC GCTCAGCGAC GACGAGGCGA CGGGCGTGTT
72121 CGCGACGCTG CTGTTCGCCG GCCACGACTC GGTGCAGCAG ATGGTCGGCT ACTGCCTCTA
72181 CGCACTGCTC AGCCACCCCG AGCAGCAGGC GGCGCTGCGC GCGCGCCCGG AGCTGGTCGA
72241 CAACGCGGTC GAGGAGATGC TCCGTTTCCT GCCCGTCAAC CAGATGGGCG TACCGCGCGT
72301 CTGTGTCGAG GACGTCGATG TGCGGGGCGT GCGCATCCGT GCGGGCGACA ACGTGATCCC
72361 GCTCTACTCG ACGGCCAACC GCGACCCCGA GGTGTTCCCG CAGCCCGACA CCTTCGATGT
72421 GACGCGCCCG CTGGAGGGCA ACTTCGCGTT CGGCCACGGC ATTCACAAGT GTCCCGGCCA
72481 GCACATCGCC CGGGTGCTCA TCAAGGTCGC CTGCCTGCGG TTGTTCGAGC GTTTCCCGGA
72541 CGTCCGGCTG GCCGGCGACG TGCCGATGAA CGAGGGGCTC GGGCTGTTCA GCCCGGCCGA
72601 GCTGCGGGTC ACCTGGGGGG CGGCATGAGT CACCCGGTGG AGACGTTGCG GTTGCCGAAC
72661 GGGACGACGG TCGCGCACAT CAACGCGGGC GAGGCGCAGT TCCTCTACCG GGAGATCTTC
72721 ACCCAGCGCT GCTACCTGCG CCACGGTGTC GACCTGCGCC CGGGGGACGT GGTGTTCGAC
72781 GTCGGCGCGA ACATCGGCAT GTTCACGCTT TTCGCGCATC TGGAGTGTCC TGGTGTGACC
72841 GTGCACGCCT TCGAGCCCGC GCCCGTGCCG TTCGCGGCGC TGCGGGCGAA CGTGACGCGG
72901 CACGGCATCC CGGGCCAGGC GGACCAGTGC GCGGTCTCCG ACAGCTCCGG CACCCGGAAG
72961 ATGACCTTCT ATCCCGACGC CACGCTGATG TCCGGTTTCC ACGCGGATGC CGCGGCCCGG
73021 ACGGAGCTGT TGCGCACGCT CGGCCTCAAC GGCGGCTACA CCGCCGAGGA CGTCGACACC
73081 ATGCTCGCGC AACTGCCCGA CGTCAGCGAG GAGATCGAAA CCCCTGTGGT CCGGCTCTCC
73141 GACGTCATCG CGGAGCGCGG TATCGAGGCC ATCGGCCTGC TGAAGGTCGA CGTGGAGAAG
73201 AGCGAACGGC AGGTCTTCGC CGGCCTCGAG GACACCGACT GGCCCCGTAT CCGCCAGGTC
73261 GTCGCGGAGG TCCACGACAT CGACGGCGCG CTCGAGGAGG TCGTCACGCT GCTCCGCGGC
73321 CATGGCTTCA CCGTGGTCGC CGAGCAGGAA CCGCTGTTCG CCGGCACGGG CATCCACCAG
73381 GTCGCCGCGC GGCGGGTGGC CGGCTGAGCG CCGTCGGGGC CGCGGCCGTC CGCACCGGCG
73441 GCCGCGGTGC GGACGGCGGC TCAGCCGGCG TCGGACAGTT CCTTGGGCAG TTGCTGACGG
73501 CCCTTCACCC CCAGCTTGCG GAACACGTTG GTGAGGTGCT GTTCCACCGT GCTGGAGGTG
73561 ACGAACAGCT GGCTGGCGAT CTCCTTGTTG GTGCGCCCGA CCGCGGCGTG CGACGCCACC
73621 CGCCGCTCCG CCTCGGTCAG CGATGTGATC CGCTGCGCCG GCGTCACGTC CTGGGTGCCG
73681 TCCGCGTCCG AGGACTCCCC ACCGAGCCGC CGGAGGAGCG GCACGGCTCC GCACTGGGTC
73741 GCGAGGTGCC GTGCGCGGCG GAACAGTCCC CGCGCACGGC TGTGCCGCCG GAGCATGCCG
73801 CACGCTTCGC CCATGTCGGC GAGGACGCGG GCCAGCTCGT ACTGGTCGCG GCACATGATG
73861 AGCAGATCGG CGGCCTCGTC GAGCAGTTCG ATCCGCTTGG CCGGCGGACT GTAGGCCGCC
73921 TGCACCCGCA GCGTCATCAC CCGCGCCCGG GACCCCATCG GCCGGGACAG CTGCTCGGAG
73981 ATGAGCCTCA GCCCCTCGTC ACGGCCGCGG CCGAGCAGCA GAAGCGCTTC GGCGGCGTCG
74041 ACCCGCCACA GGGCCAGGCC CGGCACGTCG ACGGACCAGC GTCGCATCCG CTCCCCGCAG
74101 TCCCGGAACG CGTTGTACGC CGCCCGGTAC CGCCCGGCCG CGAGATGGTG TTGCCCACGG
74161 GCCCAGACCA TGTGCAGTCC GAAGAGGCTG TCGGAGGTCT CCTCCGGCAA CGGCTCGGCG
74221 AGCCACCGCT CCGCCCGGTC CAGGTCGCCC AGTCGGATCG CGGCGGCCAC GGTGCTGCTC
74281 AGCGGCAATG CGGCGGCCAT CCCCCAGGAG GGCACGACCC GGGGGGCGAG CGCGGCCTCG
74341 CCGCATTCGA CGGCGGCGGT CAGGTCGCCG CGGCGCAGCG CGGCCTCGGC GCGGAACCCC
74401 GCGTGGACCG CCTCGTCGGC CGGGGTCCGC ATGTTGTCGT CACCGGCCAG CTTGTCGACC
74461 CAGGACTGGA CGGCATCGGT GTCCTCGGCG TAGAGCAGGG CCAGCAACGC CATCATGGTC
74521 GTGGTCCGGT CCGTCGTGAC CCGGGAGTGC TGGAGCACGT ACTCGGCTTT GGCCTCGGCC
74581 TGTTCGGACC AGCCGCGCAG CGCGTTGCTC AGGGCCTTGT CGGCGACGGC GCGGTGCCGG
74641 ACGGCTCCGG AAAACGAGGC GACCTCGTCC TCGGCCGGCG GATCGGCCGG ACGCGGCGGA
74701 TCGGCCGCGC CGGGATAGAT CAGCGCGAGG GACAGGTCCG CGACGCGCAG GTGCGCCCGG
74761 CCCTGCTCGC TCGGGGCGGC GGAGCGCTGG GCCGCCAGGA CCTCGGCGGC CTCGCCCGGC
74821 CGCCCGTCCA TCGCCAGCCA GCAGGCGAGC GACACGGCGT GCTCGCTGGA GAGGAGCCGT
74881 TCCCGCGACG CGGTGAGCAG CTCGGGCACA TGCCGGCCGG ATCTGGCGGG ATCGCAGAGC
74941 CGCTCGATGG CGGCGGTGTC GACGCGCAGT GCGGCGTGGA CGGCGGGGTC GTCGGAGGCC
75001 CGGTAGGCGA ACTCCAGGTA GGTGACGGCC TCGTCGAGCT CGCCGCGCAG GTGGTGCTCG
75061 CGCGCGGCGT CGGTGAACAG CCCGGCGACC TCGGCGCCGT GCACCCGGCC GGTACCCATC
75121 TGGTGGCGGG CGAGCACCTT GCTGGCCACG CCGCGGTCCC GCAGCAGTTC CAGCGCCAGC
75181 TCGTGCAGGC CACGCCGCTC GGCGGCGGAG AGGTCGTCGA GTACGACGGA GCGGGCCGCG
75241 GGGTGCGGGA ACCGCCCTTC CCGCAGCAGC CGCCCCTCGA CCAGCTGTTC GTGGGCCTGC
75301 TCGACCGCCT CGGTGTCGAG GCCGGTCATC CGCTGGACGA GGGTGAGTTC GACACTCTCG
75361 CCGAGCACGG GGGAAGCTCG GGCGACGCTC AGCGCGGCCG GGCCGCAACG ATAGAGCGAC
75421 CCGAGGTAGG CGAGCCGGTA CGCCCGCCCC GCGACCACTT CCAGGCACCC TGAGGTCCGT
75481 GTCCGTGCCT CCCGGATGTC GTCGATCAGG CCGTGGCCGA GGAGCAGGTT GCCGCCGGTC
75541 GCCCGGAACG CCTGGGCCAC CACGTCGTCG TGCGCGTCCT GGCCGAGGTG CCGGCGCACG
75601 AGTTCGGTGG TCTGCGCCTC GGTGAGCGGG CGCAGCGCGA TCTCCTGGTA GTGGCGCAGA
75661 CTCAGCAGTG CCGCCCGGAA TTGGGAGTGG GCGGGCGTCG GCCGGAGCAG CTCGGTCAGC
75721 ACGATGGCGA CACGGGCCCG GCTGATGCGG CGCGCGAGGT GGAGCAGGCA GCGCAGCGAC
75781 GGCGCGTCGG CGTGGTGCAC GTCGTCGATG CCGATCAGTA CGGGCCGCTC CGCGGCGAGC
75841 GTCAGCACCG TGCGGGTGAG TTCGGTCCCC AGGCGGTTGT CGACGTCGGC CGGCAGGTTT
75901 TCGCACGATG CCGTCAGCCG GACCAGCTCC GGTGTCCGGG CGGCCAGCTC GGGCTGGTCG
75961 AGGAGCTGGC CGAGCATGCC GTACGGCAGG GCCCGCTCCT CCATGGAGCA CACCGCGCGA
76021 AGGGTGACGA AGCCGGCCTT GGCCGCGGCG GCGTCGAGGA GTTCGGTCTT GCCGCAGGCG
76081 ATCGGCCCGG TGACGGCGGC GACGACGCCC CGCCCGCCCC CCGCTCGGGT GAGCGCCCGG
76141 TGGAGGGAAC CGAACTCGTC ATCGCGGGCG ATCAGGTCTG GGGGAGATAA GCGCGCTATC
76201 ACGAATGGAA CTACCTCGCG ACCGTCGTGG AAACCCATAG GCATCACATG GCTTGTTGAT
76261 CTGTACGGCT GTGATTCAGC CTGGCGGGAT GCTGTGCTAC AGATGGGAAG ATGTGATCTA
76321 GGGCCGTGCC GTTCCCTCAG GAGCCGACCG CCCCCGGCGC CACCCGCCGT ACCCCCTGGG
76381 CCACCAGCTC GGCGACCCGC TCCTGGTGGT CGACGAGGTA GAAGTGCCCG CCGGGGAAGA
76441 CCTCCACCGT GGTCGGCGCG GTCGTGTGCC CGGCCCAGGC GTGGGCCTGC TCCACCGTCG
76501 TCTTCGGATC GTCGTCACCG ATGCACACCG TGATCGGCGT CTCCAGCGGC GGCGCGGGCT
76561 CCCACCGGTA CGTCTCCGCC GCGTAGTAGT CCGCCCGCAA CGGCGCCAGG ATCAGCGCGC
76621 GCATTTCGTC GTCCGCCATC ACATCGGCGC TCGTCCCGCC GAGGCCGATG ACCGCCGCCA
76681 GCAGCTCGTC GTCGGACGCG AGGTGGTCCT GGTCGGCGCG CGGCTGCGAC GGCGCCCGCC
76741 GGCCCGAGAC GATCAGGTGC GCCACCGGGA GCCGCTGGGC CAGCTCGAAC GCGAGTGTCG
76801 CGCCCATGCT GTGGCCGAAC AGCACCAGCG GACGGTCCAG CCCCGGCTTC AACGCCTCGG
76861 CCACGAGGCC GGCGAGAACA CGCAGGTCGC GCACCGCCTC CTCGTCGCGG CGGTCCTGGC
76921 GGCCGGGGTA CTGCACGGCG TACACGTCCG CCACCGGGGC GAGCGCACGG GCCAGCGGAA
76981 GGTAGAACGT CGCCGATCCG CCGGCGTGGG GCAGCAGCAC CACCCGTACC GGGGCCTCGG
77041 GCGTGGGGAA GAACTGCCGC AGCCAGAGTT CCGAGCTCAC CGCACCCCCT CGGCCGCGAC
77101 CTGGGGAGCC CGGAACCGGG TGATCTCGGC CAAGTGCTTC TCCCGCATCT CCGGGTCGGT
77161 CACGCCCCAT CCCTCCTCCG GCGCCAGACA GAGGACGCCG ACTTTGCCGT TGTGCACATT
77221 GCGATGCACA TCGCGCACCG CCGACCCGAC GTCGTCGAGC GGGTAGGTCA CCGACAGCGT
77281 CGGGTGCACC ATCCCCTTGC AGATCAGGCG GTTCGCCTCC CACGCCTCAC GATAGTTCGC
77341 GAAGTGGGTA CCGATGATCC GCTTCACGGA CATCCACAGG TACCGATTGT CAAAGGCGTG
77401 CTCGTATCCC GAGGTTGACG CGCAGGTGAC GATCGTGCCA CCCCGACGTG TCACGTAGAC
77461 ACTCGCGCCG AACGTCGCGC GCCCCGGGTG CTCGAACACG ATGTCGGGAT CGTCACCGCC
77521 GGTCAGCTCC CGGATC



Those of skill in the art will recognize that, due to the degenerate nature of the
genetic code, a variety of DNA compounds differing in their nucleotide sequences can be
used to encode a given amino acid sequence of the invention. The native DNA sequence
encoding the FK-520 PKS of Streptomyces hygroscopicus is shown herein merely to
illustrate a preferred embodiment of the invention, and the present invention includes
DNA compounds of any sequence that encode the amino acid sequences of the
polypeptides and proteins of the invention. In similar fashion, a polypeptide can typically
tolerate one or more amino acid substitutions, deletions, and insertions in its amino acid
sequence without loss or significant loss of a desired activity. The present invention
includes such polypeptides with alternate amino acid sequences, and the amino acid
sequences shown merely illustrate preferred embodiments of the invention.
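
The degeneracy of the genetic code noted above can be illustrated with a minimal sketch. The two 15-nucleotide sequences below are hypothetical and are not taken from the FK-520 sequence; the sketch assumes the standard genetic code and shows only that two different DNA sequences can specify the same amino acid sequence.

```python
# Standard genetic code, laid out in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence into one-letter amino acid codes."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Hypothetical sequences, chosen for illustration only.
gc_rich = "ATGGCCGGCGACCTG"    # uses G/C-ending codons
alternate = "ATGGCTGGTGATTTA"  # uses synonymous A/T-ending codons

print(translate(gc_rich))
print(translate(alternate))
print(gc_rich == alternate)
```

Both calls to `translate` yield the same peptide even though the two nucleotide sequences differ at five positions.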

The recombinant nucleic acids, proteins, and peptides of the invention are many
and diverse. To facilitate an understanding of the invention and the diverse compounds 
and methods provided thereby, the following general description of the FK-520 PKS 
genes and modules of the PKS proteins encoded thereby is provided. This general 
description is followed by a more detailed description of the various domains and 
modules of the FK-520 PKS contained in and encoded by the compounds of the 
invention. In this description, reference to a heterologous PKS refers to any PKS other 
than the FK-520 PKS. Unless otherwise indicated, reference to a PKS includes reference 
to a portion of a PKS. Moreover, reference to a domain, module, or PKS includes 
reference to the nucleic acids encoding the same and vice-versa, because the methods and 
reagents of the invention provide or enable one to prepare proteins and the nucleic acids 
that encode them. 

The FK-520 PKS is composed of three proteins encoded by three genes
designated fkbA, fkbB, and fkbC. The fkbA ORF encodes extender modules 7 - 10 of the
PKS. The fkbB ORF encodes the loading module (the CoA ligase) and extender modules
1 - 4 of the PKS. The fkbC ORF encodes extender modules 5 - 6 of the PKS. The fkbP
ORF encodes the NRPS that attaches the pipecolic acid and cyclizes the FK-520
polyketide.
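
The assignment of modules to ORFs described above can be summarized as a small lookup table. The following is an illustrative sketch only: the gene names are written as fkbA, fkbB, and fkbC, the module labels and the helper functions `assembly_order` and `orf_encoding` are hypothetical names introduced here, and the NRPS encoded by fkbP is omitted because it is not a PKS module.

```python
from typing import Dict, List, Optional

# ORF-to-module assignments as stated in the text. The ORFs are listed in
# biosynthetic order (loading module first), not alphabetical order.
ORF_MODULES: Dict[str, List[str]] = {
    "fkbB": ["loading"] + [f"extender {n}" for n in range(1, 5)],
    "fkbC": [f"extender {n}" for n in range(5, 7)],
    "fkbA": [f"extender {n}" for n in range(7, 11)],
}


def assembly_order(orf_modules: Dict[str, List[str]]) -> List[str]:
    """Return all modules in the order given by the table."""
    return [module for modules in orf_modules.values() for module in modules]


def orf_encoding(module: str, orf_modules: Dict[str, List[str]]) -> Optional[str]:
    """Return the ORF that encodes the named module, or None if unknown."""
    for orf, modules in orf_modules.items():
        if module in modules:
            return orf
    return None


print(assembly_order(ORF_MODULES))
print(orf_encoding("extender 6", ORF_MODULES))
```

The table yields one loading module followed by ten extender modules, which matches the module counts given in the text.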

The loading module of the FK-520 PKS includes a CoA ligase, an ER domain, 
and an ACP domain. The starter building block or unit for FK-520 is believed to be a 
dihydroxycyclohexene carboxylic acid, which is derived from shikimate. The 
recombinant DNA compounds of the invention that encode the loading module of the 
FK-520 PKS and the corresponding polypeptides encoded thereby are useful for a variety 
of methods and in a variety of compounds. In one embodiment, a DNA compound 
comprising a sequence that encodes the FK-520 loading module is inserted into a DNA 
compound that comprises the coding sequence for a heterologous PKS. The resulting 
construct, in which the coding sequence for the loading module of the heterologous PKS 
is replaced by the coding sequence for the FK-520 loading module, provides a novel PKS 
coding sequence. Examples of heterologous PKS coding sequences include the 
rapamycin, FK-506, rifamycin, and avermectin PKS coding sequences. In another
embodiment, a DNA compound comprising a sequence that encodes the FK-520 loading
module is inserted into a DNA compound that comprises the coding sequence for the
FK-520 PKS or a recombinant FK-520 PKS that produces an FK-520 derivative.

In another embodiment, a portion of the loading module coding sequence is 
utilized in conjunction with a heterologous coding sequence. In this embodiment, the
invention provides, for example, either replacing the CoA ligase with a different CoA
ligase, deleting the ER, or replacing the ER with a different ER. In addition, or
alternatively, the ACP can be replaced by another ACP. In similar fashion, the
corresponding domains in another loading or extender module can be replaced by one or
more domains of the FK-520 PKS. The resulting heterologous loading module coding
sequence can be utilized in conjunction with a coding sequence for a PKS that
synthesizes FK-520, an FK-520 derivative, or another polyketide.

The first extender module of the FK-520 PKS includes a KS domain, an AT
domain specific for methylmalonyl CoA, a DH domain, a KR domain, and an ACP
domain. The recombinant DNA compounds of the invention that encode the first
extender module of the FK-520 PKS and the corresponding polypeptides encoded
thereby are useful for a variety of applications. In one embodiment, a DNA compound
comprising a sequence that encodes the FK-520 first extender module is inserted into a
DNA compound that comprises the coding sequence for a heterologous PKS. The
resulting construct, in which the coding sequence for a module of the heterologous PKS
is either replaced by that for the first extender module of the FK-520 PKS or the latter is 
merely added to coding sequences for modules of the heterologous PKS, provides a novel 
PKS coding sequence. In another embodiment, a DNA compound comprising a sequence 
that encodes the first extender module of the FK-520 PKS is inserted into a DNA 
compound that comprises the remainder of the coding sequence for the FK-520 PKS or a
recombinant FK-520 PKS that produces an FK-520 derivative. 

In another embodiment, all or only a portion of the first extender module coding 
sequence is utilized in conjunction with other PKS coding sequences to create a hybrid 
module. In this embodiment, the invention provides, for example, either replacing the 
methylmalonyl CoA specific AT with a malonyl CoA, ethylmalonyl CoA, or
2-hydroxymalonyl CoA specific AT; deleting either the DH or KR or both; replacing the
DH or KR or both with another DH or KR; and/or inserting an ER. In replacing or 
inserting KR, DH, and ER domains, it is often beneficial to replace the existing KR, DH, 
and ER domains with the complete set of domains desired from another module. Thus, if 
one desires to insert an ER domain, one may simply replace the existing KR and DH 
domains with a KR, DH, and ER set of domains from a module containing such domains. 
In addition, the KS and/or ACP can be replaced with another KS and/or ACP. In each of 
these replacements or insertions, the heterologous KS, AT, DH, KR, ER, or ACP coding 
sequence can originate from a coding sequence for another module of the FK-520 PKS, 
from a gene for a PKS that produces a polyketide other than FK-520, or from chemical 
synthesis. The resulting heterologous first extender module coding sequence can be 
utilized in conjunction with a coding sequence for a PKS that synthesizes FK-520, an 
FK-520 derivative, or another polyketide. In similar fashion, the corresponding domains 
in a module of a heterologous PKS can be replaced by one or more domains of the first 
extender module of the FK-520 PKS. 
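
The domain replacements described for the first extender module can be pictured as operations on a simple record of the module's domains. The following Python sketch is a toy model under stated assumptions: the Module class, its field names, and the helpers swap_at and swap_reductive_set are hypothetical names introduced here, and the donor module is invented for illustration. It does not model nucleic acid sequences or cloning steps.

```python
from dataclasses import dataclass, replace
from typing import FrozenSet


@dataclass(frozen=True)
class Module:
    """Toy description of one extender module (not a sequence-level model)."""
    name: str
    at_substrate: str              # extender unit selected by the AT domain
    reductive: FrozenSet[str]      # active KR/DH/ER domains in the module
    reductive_origin: str = "FK-520 PKS"


def swap_at(module: Module, new_substrate: str) -> Module:
    """Return a hybrid module whose AT has a different specificity."""
    return replace(module, name=module.name + " (AT replaced)",
                   at_substrate=new_substrate)


def swap_reductive_set(module: Module, donor: Module) -> Module:
    """Replace the whole KR/DH/ER set with the complete set from a donor."""
    return replace(module, name=module.name + " (KR/DH/ER replaced)",
                   reductive=donor.reductive,
                   reductive_origin=donor.reductive_origin)


# First extender module as described in the text: methylmalonyl AT, DH, KR.
fk520_m1 = Module("FK-520 module 1", "methylmalonyl CoA",
                  frozenset({"KR", "DH"}))

# Hypothetical donor module that carries a complete KR, DH, and ER set.
donor = Module("donor module", "malonyl CoA",
               frozenset({"KR", "DH", "ER"}),
               reductive_origin="heterologous PKS")

# To "insert an ER", replace the existing KR and DH with the donor's full set.
hybrid = swap_reductive_set(swap_at(fk520_m1, "malonyl CoA"), donor)
print(hybrid)
```

Replacing the reductive domains as one set, rather than adding an ER alone, mirrors the preference stated above for taking the complete set of desired domains from a single module.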

In an illustrative embodiment of this aspect of the invention, the invention 
provides recombinant PKSs and recombinant DNA compounds and vectors that encode 
such PKSs in which the KS domain of the first extender module has been inactivated. 
Such constructs are especially useful when placed in translational reading frame with the
remaining modules and domains of an FK-520 or FK-520 derivative PKS. The utility of 
these constructs is that host cells expressing, or cell free extracts containing, the PKS 
encoded thereby can be fed or supplied with N-acylcysteamine thioesters of novel 
precursor molecules to prepare FK-520 derivatives. See U.S. patent application Serial 
No. 60/117,384, filed 27 Jan. 1999, and PCT patent publication Nos. US97/02358 and
US99/03986, each of which is incorporated herein by reference. 

The second extender module of the FK-520 PKS includes a KS, an AT specific 
for methylmalonyl CoA, a KR, an inactive DH, and an ACP. The recombinant DNA 
compounds of the invention that encode the second extender module of the FK-520 PKS 
and the corresponding polypeptides encoded thereby are useful for a variety of 
applications. In one embodiment, a DNA compound comprising a sequence that encodes
the FK-520 second extender module is inserted into a DNA compound that comprises the
coding sequence for a heterologous PKS. The resulting construct, in which the coding 
sequence for a module of the heterologous PKS is either replaced by that for the second 
extender module of the FK-520 PKS or the latter is merely added to coding sequences for 
the modules of the heterologous PKS, provides a novel PKS coding sequence. In another
embodiment, a DNA compound comprising a sequence that encodes the second extender 
module of the FK-520 PKS is inserted into a DNA compound that comprises the coding 
sequence for the remainder of the FK-520 PKS or a recombinant FK-520 PKS that 
produces an FK-520 derivative.

In another embodiment, all or a portion of the second extender module coding
sequence is utilized in conjunction with other PKS coding sequences to create a hybrid
module. In this embodiment, the invention provides, for example, either replacing the
methylmalonyl CoA specific AT with a malonyl CoA, ethylmalonyl CoA, or
2-hydroxymalonyl CoA specific AT; deleting the KR and/or the inactive DH; replacing the
KR with another KR; and/or inserting an active DH or an active DH and an ER. In
addition, the KS and/or ACP can be replaced with another KS and/or ACP. In each of
these replacements or insertions, the heterologous KS, AT, DH, KR, ER, or ACP coding
sequence can originate from a coding sequence for another module of the FK-520 PKS,
from a coding sequence for a PKS that produces a polyketide other than FK-520, or from
chemical synthesis. The resulting heterologous second extender module coding sequence
can be utilized in conjunction with a coding sequence from a PKS that synthesizes
FK-520, an FK-520 derivative, or another polyketide. In similar fashion, the corresponding
domains in a module of a heterologous PKS can be replaced by one or more domains of
the second extender module of the FK-520 PKS.

The third extender module of the FK-520 PKS includes a KS, an AT specific for
malonyl CoA, a KR, an inactive DH, and an ACP. The recombinant DNA compounds of
the invention that encode the third extender module of the FK-520 PKS and the 
corresponding polypeptides encoded thereby are useful for a variety of applications. In 
one embodiment, a DNA compound comprising a sequence that encodes the FK-520 
third extender module is inserted into a DNA compound that comprises the coding
sequence for a heterologous PKS. The resulting construct, in which the coding sequence
for a module of the heterologous PKS is either replaced by that for the third extender 
module of the FK-520 PKS or the latter is merely added to coding sequences for the 
modules of the heterologous PKS, provides a novel PKS coding sequence. In another 
embodiment, a DNA compound comprising a sequence that encodes the third extender
module of the FK-520 PKS is inserted into a DNA compound that comprises the coding 
sequence for the remainder of the FK-520 PKS or a recombinant FK-520 PKS that 
produces an FK-520 derivative. 

In another embodiment, all or a portion of the third extender module coding 
sequence is utilized in conjunction with other PKS coding sequences to create a hybrid
module. In this embodiment, the invention provides, for example, either replacing the
malonyl CoA specific AT with a methylmalonyl CoA, ethylmalonyl CoA, or
2-hydroxymalonyl CoA specific AT; deleting the KR and/or the inactive DH; replacing the
KR with another KR; and/or inserting an active DH or an active DH and an ER. In
addition, the KS and/or ACP can be replaced with another KS and/or ACP. In each of
these replacements or insertions, the heterologous KS, AT, DH, KR, ER, or ACP coding
sequence can originate from a coding sequence for another module of the FK-520 PKS,
from a coding sequence for a PKS that produces a polyketide other than FK-520, or from
chemical synthesis. The resulting heterologous third extender module coding sequence
can be utilized in conjunction with a coding sequence from a PKS that synthesizes
FK-520, an FK-520 derivative, or another polyketide. In similar fashion, the corresponding
domains in a module of a heterologous PKS can be replaced by one or more domains of 
the third extender module of the FK-520 PKS. 

The fourth extender module of the FK-520 PKS includes a KS, an AT that binds 
ethylmalonyl CoA, an inactive DH, and an ACP. The recombinant DNA compounds of
the invention that encode the fourth extender module of the FK-520 PKS and the 
corresponding polypeptides encoded thereby are useful for a variety of applications. In 
one embodiment, a DNA compound comprising a sequence that encodes the FK-520 
fourth extender module is inserted into a DNA compound that comprises the coding 
sequence for a heterologous PKS. The resulting construct, in which the coding sequence
for a module of the heterologous PKS is either replaced by that for the fourth extender
module of the FK-520 PKS or the latter is merely added to coding sequences for the 
modules of the heterologous PKS, provides a novel PKS coding sequence. In another 
embodiment, a DNA compound comprising a sequence that encodes the fourth extender 
module of the FK-520 PKS is inserted into a DNA compound that comprises the 
remainder of the coding sequence for the FK-520 PKS or a recombinant FK-520 PKS 
that produces an FK-520 derivative. 

In another embodiment, a portion of the fourth extender module coding sequence 
is utilized in conjunction with other PKS coding sequences to create a hybrid module. In 
this embodiment, the invention provides, for example, either replacing the ethylmalonyl 
CoA specific AT with a malonyl CoA, methylmalonyl CoA, or 2-hydroxymalonyl CoA 
specific AT; and/or deleting the inactive DH, inserting a KR, a KR and an active DH, or a 
KR, an active DH, and an ER. In addition, the KS and/or ACP can be replaced with 
another KS and/or ACP. In each of these replacements or insertions, the heterologous KS, 
AT, DH, KR, ER, or ACP coding sequence can originate from a coding sequence for 
another module of the FK-520 PKS, a PKS for a polyketide other than FK-520, or from 
chemical synthesis. The resulting heterologous fourth extender module coding sequence 
can be utilized in conjunction with a coding sequence for a PKS that synthesizes FK-520, 
an FK-520 derivative, or another polyketide. In similar fashion, the corresponding 
domains in a module of a heterologous PKS can be replaced by one or more domains of 
the fourth extender module of the FK-520 PKS. 

As illustrative examples, the present invention provides recombinant genes, 
vectors, and host cells that result from the conversion of the FK-506 PKS to an FK-520 
PKS and vice-versa. In one embodiment, the invention provides a recombinant set of FK- 
506 PKS genes but in which the coding sequences for the fourth extender module or at 
least those for the AT domain in the fourth extender module have been replaced by those 
for the AT domain of the fourth extender module of the FK-520 PKS. This recombinant 
PKS can be used to produce FK-520 in recombinant host cells. In another embodiment, 
the invention provides a recombinant set of FK-520 PKS genes but in which the coding 
sequences for the fourth extender module or at least those for the AT domain in the fourth
extender module have been replaced by those for the AT domain of the fourth extender
module of the FK-506 PKS. This recombinant PKS can be used to produce FK-506 in 
recombinant host cells. 

Other examples of hybrid PKS enzymes of the invention include those in which 
the AT domain of module 4 has been replaced with a malonyl specific AT domain to
provide a PKS that produces 21-desethyl-FK520 or with a methylmalonyl specific AT 
domain to provide a PKS that produces 21-desethyl-21-methyl-FK520. Another hybrid 
PKS of the invention is prepared by replacing the AT and inactive KR domain of FK-520 
extender module 4 with a methylmalonyl specific AT and an active KR domain, such as,
for example, from module 2 of the DEBS or oleandolide PKS enzymes, to produce
21-desethyl-21-methyl-22-desoxo-22-hydroxy-FK520. The compounds produced by these
hybrid PKS enzymes are neurotrophins. 

The fifth extender module of the FK-520 PKS includes a KS, an AT that binds 
methylmalonyl CoA, a DH, a KR, and an ACP. The recombinant DNA compounds of the
invention that encode the fifth extender module of the FK-520 PKS and the
corresponding polypeptides encoded thereby are useful for a variety of applications. In
one embodiment, a DNA compound comprising a sequence that encodes the FK-520 fifth 
extender module is inserted into a DNA compound that comprises the coding sequence 
for a heterologous PKS. The resulting construct, in which the coding sequence for a
module of the heterologous PKS is either replaced by that for the fifth extender module
of the FK-520 PKS or the latter is merely added to coding sequences for the modules of 
the heterologous PKS, provides a novel PKS. In another embodiment, a DNA compound 
comprising a sequence that encodes the fifth extender module of the FK-520 PKS is 
inserted into a DNA compound that comprises the coding sequence for the FK-520 PKS
or a recombinant FK-520 PKS that produces an FK-520 derivative.

In another embodiment, a portion of the fifth extender module coding sequence is 
utilized in conjunction with other PKS coding sequences to create a hybrid module. In 
this embodiment, the invention provides, for example, either replacing the methylmalonyl 
CoA specific AT with a malonyl CoA, ethylmalonyl CoA, or 2-hydroxymalonyl CoA
specific AT; deleting either or both of the DH and KR; replacing either or both of the
DH and KR with another DH and/or KR; and/or inserting an ER. In addition, the KS
and/or ACP can be replaced with another KS and/or ACP. In each of these replacements 
or insertions, the heterologous KS, AT, DH, KR, ER, or ACP coding sequence can 
originate from a coding sequence for another module of the FK-520 PKS, from a coding 
sequence for a PKS that produces a polyketide other than FK-520, or from chemical
synthesis. The resulting heterologous fifth extender module coding sequence can be 
utilized in conjunction with a coding sequence for a PKS that synthesizes FK-520, an 
FK-520 derivative, or another polyketide. In similar fashion, the corresponding domains 
in a module of a heterologous PKS can be replaced by one or more domains of the fifth
extender module of the FK-520 PKS.

In an illustrative embodiment, the present invention provides a set of recombinant 
FK-520 PKS genes in which the coding sequences for the DH domain of the fifth 
extender module have been deleted or mutated to render the DH non-functional. In one 
such mutated gene, the KR and DH coding sequences are replaced with those encoding
only a KR domain from another PKS gene. The resulting PKS genes code for the
expression of an FK-520 PKS that produces an FK-520 analog that lacks the C-19 to
C-20 double bond of FK-520 and has a C-20 hydroxyl group. Such analogs are preferred
neurotrophins, because they have little or no immunosuppressant activity. This 
recombinant fifth extender module coding sequence can be combined with other coding
sequences to make additional compounds of the invention. In an illustrative embodiment,
the present invention provides a recombinant FK-520 PKS that contains both this fifth 
extender module and the recombinant fourth extender module described above that 
comprises the coding sequence for the fourth extender module AT domain of the FK-506 
PKS. The invention also provides recombinant host cells derived from FK-506 producing
host cells that have been mutated to prevent production of FK-506 but that express this
recombinant PKS and so synthesize the corresponding (lacking the C-19 to C-20 double
bond of FK-506 and having a C-20 hydroxyl group) FK-506 derivative. In another 
embodiment, the present invention provides a recombinant FK-506 PKS in which the DH 
domain of module 5 has been deleted or otherwise rendered inactive and thus produces
this novel polyketide.

The sixth extender module of the FK-520 PKS includes a KS, an AT specific for
methylmalonyl CoA, a KR, a DH, an ER, and an ACP. The recombinant DNA 
compounds of the invention that encode the sixth extender module of the FK-520 PKS 
and the corresponding polypeptides encoded thereby are useful for a variety of 
applications. In one embodiment, a DNA compound comprising a sequence that encodes 
the FK-520 sixth extender module is inserted into a DNA compound that comprises the 
coding sequence for a heterologous PKS. The resulting construct, in which the coding 
sequence for a module of the heterologous PKS is either replaced by that for the sixth 
extender module of the FK-520 PKS or the latter is merely added to coding sequences for 
the modules of the heterologous PKS, provides a novel PKS coding sequence. In another 
embodiment, a DNA compound comprising a sequence that encodes the sixth extender 
module of the FK-520 PKS is inserted into a DNA compound that comprises the coding 
sequence for the remainder of the FK-520 PKS or a recombinant FK-520 PKS that 
produces an FK-520 derivative. 

In another embodiment, a portion of the sixth extender module coding sequence is 
utilized in conjunction with other PKS coding sequences to create a hybrid module. In 
this embodiment, the invention provides, for example, either replacing the methylmalonyl 
CoA specific AT with a malonyl CoA, ethylmalonyl CoA, or 2-hydroxymalonyl CoA 
specific AT; deleting any one, two, or all three of the KR, DH, and ER; and/or replacing 
any one, two, or all three of the KR, DH, and ER with another KR, DH, and ER. In 
addition, the KS and/or ACP can be replaced with another KS and/or ACP. In each of 
these replacements, the heterologous KS, AT, DH, KR, ER, or ACP coding sequence can 
originate from a coding sequence for another module of the FK-520 PKS, from a coding 
sequence for a PKS that produces a polyketide other than FK-520, or from chemical 
synthesis. The resulting heterologous sixth extender module coding sequence can be 
utilized in conjunction with a coding sequence for a PKS that synthesizes FK-520, an 
FK-520 derivative, or another polyketide. In similar fashion, the corresponding domains 
in a module of a heterologous PKS can be replaced by one or more domains of the sixth 
extender module of the FK-520 PKS.

In an illustrative embodiment, the present invention provides a set of recombinant
FK-520 PKS genes in which the coding sequences for the DH and ER domains of the 
sixth extender module have been deleted or mutated to render them non-functional. In 
one such mutated gene, the KR, ER, and DH coding sequences are replaced with those 
encoding only a KR domain from another PKS gene. This can also be accomplished by 
simply replacing the coding sequences for extender module six with those for an extender 
module having a methylmalonyl specific AT and only a KR domain from a heterologous 
PKS gene, such as, for example, the coding sequences for extender module two encoded 
by the eryAI gene. The resulting PKS genes code for the expression of an FK-520 PKS 
that produces an FK-520 analog that has a C-18 hydroxyl group. Such analogs are
preferred neurotrophins, because they have little or no immunosuppressant activity. This 
recombinant sixth extender module coding sequence can be combined with other coding 
sequences to make additional compounds of the invention. In an illustrative embodiment, 
the present invention provides a recombinant FK-520 PKS that contains both this sixth 
extender module and the recombinant fourth extender module described above that 
comprises the coding sequence for the fourth extender module AT domain of the FK-506 
PKS. The invention also provides recombinant host cells derived from FK-506 producing 
host cells that have been mutated to prevent production of FK-506 but that express this 
recombinant PKS and so synthesize the corresponding (having a C-18 hydroxyl group) 
FK-506 derivative. In another embodiment, the present invention provides a recombinant 
FK-506 PKS in which the DH and ER domains of module 6 have been deleted or 
otherwise rendered inactive and thus produces this novel polyketide. 
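
The outcomes described for the fifth and sixth extender modules follow from the order in which the reductive domains act on the beta-carbon: a KR reduces the keto group to a hydroxyl, a DH dehydrates the hydroxyl to a double bond, and an ER reduces the double bond. The following Python sketch illustrates that rule only; the function name beta_carbon_state and the output labels are assumptions introduced here, and the sketch ignores stereochemistry and any later tailoring steps.

```python
def beta_carbon_state(active_domains):
    """Predict the beta-carbon functionality left by one extender module.

    KR acts first (keto -> hydroxyl), then DH (hydroxyl -> double bond),
    then ER (double bond -> saturated). A later domain is ignored when an
    earlier one is missing, because it would have no substrate to act on.
    """
    domains = set(active_domains)
    if "KR" not in domains:
        return "keto"
    if "DH" not in domains:
        return "hydroxyl"
    if "ER" not in domains:
        return "double bond"
    return "saturated"


if __name__ == "__main__":
    # Module 5 as described (DH and KR), then with the DH removed.
    print("module 5 native:   ", beta_carbon_state({"KR", "DH"}))
    print("module 5, no DH:   ", beta_carbon_state({"KR"}))
    # Module 6 as described (KR, DH, ER), then with the DH and ER removed.
    print("module 6 native:   ", beta_carbon_state({"KR", "DH", "ER"}))
    print("module 6, KR only: ", beta_carbon_state({"KR"}))
```

Under this rule, leaving only a KR in module 5 or module 6 gives a hydroxyl in place of the double bond or saturated carbon, consistent with the C-20 and C-18 hydroxyl analogs described above.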

The seventh extender module of the FK-520 PKS includes a KS, an AT specific 
for 2-hydroxymalonyl CoA, a KR, a DH, an ER, and an ACP. The recombinant DNA 
compounds of the invention that encode the seventh extender module of the FK-520 PKS 
and the corresponding polypeptides encoded thereby are useful for a variety of 
applications. In one embodiment, a DNA compound comprising a sequence that encodes 
the FK-520 seventh extender module is inserted into a DNA compound that comprises 
the coding sequence for a heterologous PKS. The resulting construct, in which the coding 
sequence for a module of the heterologous PKS is either replaced by that for the seventh
extender module of the FK-520 PKS or the latter is merely added to coding sequences for
the modules of the heterologous PKS, provides a novel PKS coding sequence. In another 
embodiment, a DNA compound comprising a sequence that encodes the seventh extender 
module of the FK-520 PKS is inserted into a DNA compound that comprises the coding 
sequence for the remainder of the FK-520 PKS or a recombinant FK-520 PKS that
produces an FK-520 derivative. 

In another embodiment, a portion or all of the seventh extender module coding 
sequence is utilized in conjunction with other PKS coding sequences to create a hybrid 
module. In this embodiment, the invention provides, for example, either replacing the
2-hydroxymalonyl CoA specific AT with a methylmalonyl CoA, ethylmalonyl CoA, or
malonyl CoA specific AT; deleting the KR, the DH, and/or the ER; and/or replacing the 
KR, DH, and/or ER. In addition, the KS and/or ACP can be replaced with another KS 
and/or ACP. In each of these replacements or insertions, the heterologous KS, AT, DH, 
KR, ER, or ACP coding sequence can originate from a coding sequence for another
module of the FK-520 PKS, from a coding sequence for a PKS that produces a polyketide
other than FK-520, or from chemical synthesis. The resulting heterologous seventh 
extender module coding sequence can be utilized in conjunction with a coding sequence 
for a PKS that synthesizes FK-520, an FK-520 derivative, or another polyketide. In 
similar fashion, the corresponding domains in a module of a heterologous PKS can be
replaced by one or more domains of the seventh extender module of the FK-520 PKS.

In an illustrative embodiment, the present invention provides a set of recombinant 
FK-520 PKS genes in which the coding sequences for the AT domain of the seventh
extender module have been replaced with those encoding an AT domain for malonyl,
methylmalonyl, or ethylmalonyl CoA from another PKS gene. The resulting PKS genes
code for the expression of an FK-520 PKS that produces an FK-520 analog that lacks the
C-15 methoxy group, having instead a hydrogen, methyl, or ethyl group at that position, 
respectively. Such analogs are preferred, because they are more slowly metabolized than 
FK-520. This recombinant seventh extender module coding sequence can be combined 
with other coding sequences to make additional compounds of the invention. In an
illustrative embodiment, the present invention provides a recombinant FK-520 PKS that
contains both this seventh extender module and the recombinant fourth extender module
described above that comprises the coding sequence for the fourth extender module AT 
domain of the FK-506 PKS. The invention also provides recombinant host cells derived 
from FK-506 producing host cells that have been mutated to prevent production of FK- 
506 but that express this recombinant PKS and so synthesize the corresponding (C-15- 
desmethoxy) FK-506 derivative. In another embodiment, the present invention provides a 
recombinant FK-506 PKS in which the AT domain of module 7 has been replaced and 
thus produces this novel polyketide. 
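
The relationship between the specificity of the replacement AT domain and the substituent found at the affected position can be tabulated. The following Python sketch restates only what the text says for AT replacements at extender modules 7 and 8 (the C-15 and C-13 methoxy positions, respectively); the names AT_SUBSTITUENT, METHOXY_POSITION, and describe_at_swap are hypothetical and introduced here for illustration.

```python
# Substituent introduced by each replacement AT specificity, as stated in
# the text for AT replacements at extender modules 7 and 8.
AT_SUBSTITUENT = {
    "malonyl": "hydrogen",
    "methylmalonyl": "methyl",
    "ethylmalonyl": "ethyl",
}

# Positions of the native methoxy groups named in the text, keyed by the
# extender module whose 2-hydroxymalonyl specific AT is replaced.
METHOXY_POSITION = {7: "C-15", 8: "C-13"}


def describe_at_swap(module, new_at):
    """Describe the analog expected when the module's native AT is replaced."""
    position = METHOXY_POSITION[module]
    group = AT_SUBSTITUENT[new_at]
    return f"{position}-desmethoxy analog with {group} at {position}"


if __name__ == "__main__":
    for at in AT_SUBSTITUENT:
        print(describe_at_swap(7, at))
```

The lookup is deliberately limited to the two modules and three specificities named in the text; other combinations raise a KeyError rather than guess at a product.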

In another illustrative embodiment, the present invention provides a hybrid PKS 
in which the AT and KR domains of module 7 of the FK-520 PKS are replaced by a 
methylmalonyl specific AT domain and an inactive KR domain, such as, for example, the 
AT and KR domains of extender module 6 of the rapamycin PKS. The resulting hybrid 
PKS produces 15-desmethoxy-15-methyl-16-oxo-FK-520, a neurotrophin compound.

The eighth extender module of the FK-520 PKS includes a KS, an AT specific for 
2-hydroxymalonyl CoA, a KR, and an ACP. The recombinant DNA compounds of the 
invention that encode the eighth extender module of the FK-520 PKS and the 
corresponding polypeptides encoded thereby are useful for a variety of applications. In 
one embodiment, a DNA compound comprising a sequence that encodes the FK-520 
eighth extender module is inserted into a DNA compound that comprises the coding 
sequence for a heterologous PKS. The resulting construct, in which the coding sequence 
for a module of the heterologous PKS is either replaced by that for the eighth extender 
module of the FK-520 PKS or the latter is merely added to coding sequences for the 
modules of the heterologous PKS, provides a novel PKS coding sequence. In another 
embodiment, a DNA compound comprising a sequence that encodes the eighth extender 
module of the FK-520 PKS is inserted into a DNA compound that comprises the coding 
sequence for the remainder of the FK-520 PKS or a recombinant FK-520 PKS that 
produces an FK-520 derivative. 

In another embodiment, a portion of the eighth extender module coding sequence 
is utilized in conjunction with other PKS coding sequences to create a hybrid module. In 
this embodiment, the invention provides, for example, either replacing the
2-hydroxymalonyl CoA specific AT with a methylmalonyl CoA, ethylmalonyl CoA, or
malonyl CoA specific AT; deleting or replacing the KR; and/or inserting a DH or a DH 
and an ER. In addition, the KS and/or ACP can be replaced with another KS and/or ACP. 
In each of these replacements, the heterologous KS, AT, DH, KR, ER, or ACP coding 
sequence can originate from a coding sequence for another module of the FK-520 PKS, 
from a coding sequence for a PKS that produces a polyketide other than FK-520, or from 
chemical synthesis. The resulting heterologous eighth extender module coding sequence 
can be utilized in conjunction with a PKS that synthesizes FK-520, an FK-520 derivative, 
or another polyketide. In similar fashion, the corresponding domains in a module of a 
heterologous PKS can be replaced by one or more domains of the eighth extender module 
of the FK-520 PKS. 

In an illustrative embodiment, the present invention provides a set of recombinant 
FK-520 PKS genes in which the coding sequences for the AT domain of the eighth
extender module have been replaced with those encoding an AT domain for malonyl,
methylmalonyl, or ethylmalonyl CoA from another PKS gene. The resulting PKS genes 
code for the expression of an FK-520 PKS that produces an FK-520 analog that lacks the 
C-13 methoxy group, having instead a hydrogen, methyl, or ethyl group at that position, 
respectively. Such analogs are preferred, because they are more slowly metabolized than 
FK-520. This recombinant eighth extender module coding sequence can be combined 
with other coding sequences to make additional compounds of the invention. In an 
illustrative embodiment, the present invention provides a recombinant FK-520 PKS that 
contains both this eighth extender module and the recombinant fourth extender module 
described above that comprises the coding sequence for the fourth extender module AT 
domain of the FK-506 PKS. The invention also provides recombinant host cells derived 
from FK-506 producing host cells that have been mutated to prevent production of FK- 
506 but that express this recombinant PKS and so synthesize the corresponding (C-13- 
desmethoxy) FK-506 derivative. In another embodiment, the present invention provides a 
recombinant FK-506 PKS in which the AT domain of module 8 has been replaced and 
thus produces this novel polyketide.

The ninth extender module of the FK-520 PKS includes a KS, an AT specific for
methylmalonyl CoA, a KR, a DH, an ER, and an ACP. The recombinant DNA 
compounds of the invention that encode the ninth extender module of the FK-520 PKS 
and the corresponding polypeptides encoded thereby are useful for a variety of 
applications. In one embodiment, a DNA compound comprising a sequence that encodes 
the FK-520 ninth extender module is inserted into a DNA compound that comprises the 
coding sequence for a heterologous PKS. The resulting construct, in which the coding 
sequence for a module of the heterologous PKS is either replaced by that for the ninth 
extender module of the FK-520 PKS or the latter is merely added to coding sequences for 
the modules of the heterologous PKS, provides a novel PKS coding sequence. In another 
embodiment, a DNA compound comprising a sequence that encodes the ninth extender 
module of the FK-520 PKS is inserted into a DNA compound that comprises the coding 
sequence for the remainder of the FK-520 PKS or a recombinant FK-520 PKS that 
produces an FK-520 derivative. 

In another embodiment, a portion of the ninth extender module coding sequence 
is utilized in conjunction with other PKS coding sequences to create a hybrid module. In 
this embodiment, the invention provides, for example, either replacing the methylmalonyl 
CoA specific AT with a malonyl CoA, ethylmalonyl CoA, or 2-hydroxymalonyl CoA 
specific AT; deleting any one, two, or all three of the KR, DH, and ER; and/or replacing 
any one, two, or all three of the KR, DH, and ER with another KR, DH, and/or ER. In 
addition, the KS and/or ACP can be replaced with another KS and/or ACP. In each of 
these replacements, the heterologous KS, AT, DH, KR, ER, or ACP coding sequence can 
originate from a coding sequence for another module of the FK-520 PKS, from a coding 
sequence for a PKS that produces a polyketide other than FK-520, or from chemical 
synthesis. The resulting heterologous ninth extender module coding sequence can be 
utilized in conjunction with a PKS that synthesizes FK-520, an FK-520 derivative, or 
another polyketide. In similar fashion, the corresponding domains in a module of a 
heterologous PKS can be replaced by one or more domains of the ninth extender module 
of the FK-520 PKS. 

The tenth extender module of the FK-520 PKS includes a KS, an AT specific for
malonyl CoA, and an ACP. The recombinant DNA compounds of the invention that 
encode the tenth extender module of the FK-520 PKS and the corresponding polypeptides 
encoded thereby are useful for a variety of applications. In one embodiment, a DNA 
compound comprising a sequence that encodes the FK-520 tenth extender module is 
inserted into a DNA compound that comprises the coding sequence for a heterologous 
PKS. The resulting construct, in which the coding sequence for a module of the 
heterologous PKS is either replaced by that for the tenth extender module of the FK-520 
PKS or the latter is merely added to coding sequences for the modules of the 
heterologous PKS, provides a novel PKS coding sequence. In another embodiment, a 
DNA compound comprising a sequence that encodes the tenth extender module of the 
FK-520 PKS is inserted into a DNA compound that comprises the coding sequence for 
the remainder of the FK-520 PKS or a recombinant FK-520 PKS that produces an
FK-520 derivative.

In another embodiment, a portion or all of the tenth extender module coding 
sequence is utilized in conjunction with other PKS coding sequences to create a hybrid 
module. In this embodiment, the invention provides, for example, either replacing the 
malonyl CoA specific AT with a methylmalonyl CoA, ethylmalonyl CoA, or
2-hydroxymalonyl CoA specific AT; and/or inserting a KR, a KR and DH, or a KR, DH,
and an ER. In addition, the KS and/or ACP can be replaced with another KS and/or ACP. 
In each of these replacements or insertions, the heterologous KS, AT, DH, KR, ER, or 
ACP coding sequence can originate from a coding sequence for another module of the 
FK-520 PKS, from a coding sequence for a PKS that produces a polyketide other than 
FK-520, or from chemical synthesis. The resulting heterologous tenth extender module 
coding sequence can be utilized in conjunction with a coding sequence for a PKS that 
synthesizes FK-520, an FK-520 derivative, or another polyketide. In similar fashion, the 
corresponding domains in a module of a heterologous PKS can be replaced by one or 
more domains of the tenth extender module of the FK-520 PKS. 
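
The domain manipulations described above for the ninth and tenth extender modules can be pictured with a small data model. The following Python sketch is illustrative only. The names Module, replace_at, and insert_reductive_domains are hypothetical and are not part of the invention; the sketch tracks domain labels and does not predict whether a given construct is functional.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Module:
    """Toy model of an extender module: an AT specificity plus optional reductive domains."""
    name: str
    at_substrate: str                 # e.g. "malonyl CoA"
    reductive: Tuple[str, ...] = ()   # labels drawn from KR, DH, ER

    def domains(self) -> Tuple[str, ...]:
        # Labels are listed in the order used in the text, which is not
        # necessarily their order along the polypeptide.
        return ("KS", "AT") + self.reductive + ("ACP",)


def replace_at(module: Module, new_substrate: str) -> Module:
    """Return a copy of the module whose AT has a different extender-unit specificity."""
    return replace(module, at_substrate=new_substrate)


def insert_reductive_domains(module: Module, domains: Tuple[str, ...]) -> Module:
    """Return a copy of the module with a KR; a KR and DH; or a KR, DH, and ER inserted."""
    allowed = {("KR",), ("KR", "DH"), ("KR", "DH", "ER")}
    if domains not in allowed:
        raise ValueError("expected KR; KR and DH; or KR, DH, and ER")
    return replace(module, reductive=domains)


# Tenth extender module as described in the text: KS, malonyl CoA specific AT, ACP.
module10 = Module("FK-520 module 10", "malonyl CoA")

# One hypothetical hybrid: swap the AT specificity and insert a KR and DH.
hybrid = insert_reductive_domains(replace_at(module10, "methylmalonyl CoA"), ("KR", "DH"))
print(hybrid.domains(), hybrid.at_substrate)
```

Because the model is immutable, each manipulation returns a new Module and leaves the description of the native module unchanged, which mirrors the way a hybrid coding sequence is derived from, rather than substituted for, the native sequence.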

The FK-520 polyketide precursor produced by the action of the tenth extender 
module of the PKS is then attached to pipecolic acid and cyclized to form FK-520. The 
enzyme FkbP is the NRPS-like enzyme that catalyzes these reactions. FkbP also includes
a thioesterase activity that cleaves the nascent FK-520 polyketide from the NRPS. The
present invention provides recombinant DNA compounds that encode the fkbP gene and
so provides recombinant methods for expressing the fkbP gene product in recombinant
host cells. The recombinant fkbP genes of the invention include those in which the coding
sequence for the adenylation domain has been mutated or replaced with coding sequences
from other NRPS-like enzymes so that the resulting recombinant FkbP incorporates a
moiety other than pipecolic acid. For the construction of host cells that do not naturally
produce pipecolic acid, the present invention provides recombinant DNA compounds that
express the enzymes that catalyze at least some of the biosynthesis of pipecolic acid (see
Nielsen et al., 1991, Biochem. 30: 5789-96). The fkbL gene encodes a homolog of RapL,
a lysine cyclodeaminase responsible in part for producing the pipecolate unit added to the
end of the polyketide chain. The fkbB and fkbL recombinant genes of the invention can be
used in heterologous hosts to produce compounds such as FK-520 or, in conjunction with
other PKS or NRPS genes, to produce known or novel polyketides and non-ribosomal
peptides.

The present invention also provides recombinant DNA compounds that encode 
the P450 oxidase and methyltransferase genes involved in the biosynthesis of FK-520. 
Figure 2 shows the various sites on the FK-520 polyketide core structure at which these 
enzymes act. By providing these genes in recombinant form, the present invention 
provides recombinant host cells that can produce FK-520. This is accomplished by 
introducing the recombinant PKS, P450 oxidase, and methyltransferase genes into a 
heterologous host cell. In a preferred embodiment, the heterologous host cell is 
Streptomyces coelicolor CH999 or Streptomyces lividans K4-114, as described in U.S.
Patent No. 5,830,750 and U.S. patent application Serial Nos. 08/828,898, filed 31 Mar. 
1997, and 09/181,833, filed 28 Oct. 1998, each of which is incorporated herein by 
reference. In addition, by providing recombinant host cells that express only a subset of 
these genes, the present invention provides methods for making FK-520 precursor 
compounds not readily obtainable by other means. 

In a related aspect, the present invention provides recombinant DNA compounds
and vectors that are useful in generating, by homologous recombination, recombinant 
host cells that produce FK-520 precursor compounds. In this aspect of the invention, a 
native host cell that produces FK-520 is transformed with a vector (such as an SCP2* 
derived vector for Streptomyces host cells) that encodes one or more disrupted genes (i.e., 
a hydroxylase, a methyltransferase, or both) or merely flanking regions from those genes. 
When the vector integrates by homologous recombination, the native, functional gene is 
deleted or replaced by the non-functional recombinant gene, and the resulting host cell 
thus produces an FK-520 precursor. Such host cells can also be complemented by 
introduction of a modified form of the deleted or mutated non-functional gene to produce 
a novel compound. 

In one important embodiment, the present invention provides a hybrid PKS and 
the corresponding recombinant DNA compounds that encode those hybrid PKS enzymes. 
For purposes of the present invention a hybrid PKS is a recombinant PKS that comprises 
all or part of one or more modules and thioesterase/cyclase domain of a first PKS and all 
or part of one or more modules, loading module, and thioesterase/cyclase domain of a 
second PKS. In one preferred embodiment, the first PKS is all or part of the FK-520 
PKS, and the second PKS is only a portion or all of a non-FK-520 PKS. 

One example of the preferred embodiment is an FK-520 PKS in which the AT 
domain of module 8, which specifies a hydroxymalonyl CoA and from which the C-13
methoxy group of FK-520 is derived, is replaced by an AT domain that specifies a
malonyl, methylmalonyl, or ethylmalonyl CoA. Examples of such replacement AT
domains include the AT domains from modules 3, 12, and 13 of the rapamycin PKS and
from modules 1 and 2 of the erythromycin PKS. Such replacements, conducted at the 
level of the gene for the PKS, are illustrated in the examples below. Another illustrative 
example of such a hybrid PKS includes an FK-520 PKS in which the natural loading 
module has been replaced with a loading module of another PKS. Another example of 
such a hybrid PKS is an FK-520 PKS in which the AT domain of module three is 
replaced with an AT domain that binds methylmalonyl CoA. 

In another preferred embodiment, the first PKS is most but not all of a non-FK-520
PKS, and the second PKS is only a portion or all of the FK-520 PKS. An illustrative
example of such a hybrid PKS includes an erythromycin PKS in which an AT specific for
methylmalonyl CoA is replaced with an AT from the FK-520 PKS specific for malonyl
CoA.

Those of skill in the art will recognize that all or part of either the first or second 
PKS in a hybrid PKS of the invention need not be isolated from a naturally occurring 
source. For example, only a small portion of an AT domain determines its specificity. See 
U.S. provisional patent application Serial No. 60/091,526, incorporated herein by 
reference. The state of the art in DNA synthesis allows the artisan to construct de novo 
DNA compounds of size sufficient to construct a useful portion of a PKS module or 
domain. For purposes of the present invention, such synthetic DNA compounds are 
deemed to be a portion of a PKS. 

Thus, the hybrid modules of the invention are incorporated into a PKS to provide 
a hybrid PKS of the invention. A hybrid PKS of the invention can result not only: 

(i) from fusions of heterologous domain (where heterologous means the domains 
in that module are from at least two different naturally occurring modules) coding 
sequences to produce a hybrid module coding sequence contained in a PKS gene whose 
product is incorporated into a PKS, 

but also: 

(ii) from fusions of heterologous module (where heterologous module means two 
modules are adjacent to one another that are not adjacent to one another in naturally 
occurring PKS enzymes) coding sequences to produce a hybrid coding sequence 
contained in a PKS gene whose product is incorporated into a PKS, 

(iii) from expression of one or more FK-520 PKS genes with one or more non- 
FK-520 PKS genes, including both naturally occurring and recombinant non-FK-520 
PKS genes, and 

(iv) from combinations of the foregoing. 

Various hybrid PKSs of the invention illustrating these various alternatives are described 
herein. 
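
The first two alternatives above turn on two different tests: whether the domains within one module come from more than one naturally occurring module, and whether two adjacent modules are adjacent in any naturally occurring PKS. A minimal Python sketch of those two tests follows. The module labels (fk7, fk8, rap12, and so on) and the function names are hypothetical and are used only for illustration.

```python
from typing import List, Set, Tuple

Domain = Tuple[str, str]   # (domain label, natural module the domain was taken from)


def is_hybrid_module(module: List[Domain]) -> bool:
    """Alternative (i): the domains come from at least two different natural modules."""
    return len({source for _, source in module}) >= 2


def has_heterologous_junction(order: List[str], natural_pairs: Set[Tuple[str, str]]) -> bool:
    """Alternative (ii): some adjacent pair of modules is not adjacent in a natural PKS."""
    return any(pair not in natural_pairs for pair in zip(order, order[1:]))


# Hypothetical labels: fk7, fk8, fk9 stand for FK-520 modules; rap12, rap13 for rapamycin modules.
natural_pairs = {("fk7", "fk8"), ("fk8", "fk9"), ("rap12", "rap13")}

native_module = [("KS", "fk8"), ("AT", "fk8"), ("ACP", "fk8")]
swapped_at = [("KS", "fk8"), ("AT", "rap12"), ("ACP", "fk8")]

print(is_hybrid_module(native_module))                                      # False
print(is_hybrid_module(swapped_at))                                         # True
print(has_heterologous_junction(["fk7", "fk8", "fk9"], natural_pairs))      # False
print(has_heterologous_junction(["fk7", "rap12", "rap13"], natural_pairs))  # True
```

A single PKS can satisfy both tests at once, which corresponds to alternative (iv).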

Examples of the production of a hybrid PKS by co-expression of PKS genes from
the FK-520 PKS and another non-FK-520 PKS include hybrid PKS enzymes produced by 
coexpression of FK-520 and rapamycin PKS genes. Preferably, such hybrid PKS 
enzymes are produced in recombinant Streptomyces host cells that produce FK-520 or 
FK-506 but have been mutated to inactivate the gene whose function is to be replaced by 
the rapamycin PKS gene introduced to produce the hybrid PKS. Particular examples 
include (i) replacement of the fkbC gene with the rapB gene; and (ii) replacement of the
fkbA gene with the rapC gene. The latter hybrid PKS produces 13,15-didesmethoxy-FK-520,
if the host cell is an FK-520 producing host cell, and 13,15-didesmethoxy-FK-506,
if the host cell is an FK-506 producing host cell. The compounds produced by these 
hybrid PKS enzymes are immunosuppressants and neurotrophins but can be readily 
modified to act only as neurotrophins, as described in Example 6, below. 

Other illustrative hybrid PKS enzymes of the invention are prepared by replacing 
the fkbA gene of an FK-520 or FK-506 producing host cell with a hybrid fkbA gene in 
which: (a) the extender module 8 through 10, inclusive, coding sequences have been 
replaced by the coding sequences for extender modules 12 to 14, inclusive, of the
rapamycin PKS; and (b) the module 8 coding sequences have been replaced by the
module 8 coding sequence of the rifamycin PKS. When expressed with the other,
naturally occurring FK-520 or FK-506 PKS genes and the genes of the modification
enzymes, the resulting hybrid PKS enzymes produce, respectively, (a) 13-desmethoxy-FK-520
or 13-desmethoxy-FK-506; and (b) 13-desmethoxy-13-methyl-FK-520 or
13-desmethoxy-13-methyl-FK-506. In a preferred embodiment, these recombinant PKS
genes of the invention are introduced into the producing host cell by a vector such as
pHU204, which is a plasmid pRM5 derivative that has the well-characterized SCP2*
replicon, the colE1 replicon, the tsr and bla resistance genes, and a cos site. This vector
can be used to introduce the recombinant fkbA replacement gene in an FK-520 or FK-506
producing host cell (or a host cell derived therefrom in which the endogenous fkbA gene
has been rendered inactive by mutation, deletion, or homologous recombination
with the gene that replaces it) to produce the desired hybrid PKS.

In constructing hybrid PKSs of the invention, certain general methods may be
helpful. For example, it is often beneficial to retain the framework of the module to be 
altered to make the hybrid PKS. Thus, if one desires to add DH and ER functionalities to 
a module, it is often preferred to replace the KR domain of the original module with a 
KR, DH, and ER domain-containing segment from another module, instead of merely 
inserting DH and ER domains. One can alter the stereochemical specificity of a module 
by replacement of the KS domain with a KS domain from a module that specifies a 
different stereochemistry. See Lau et al., 1999, "Dissecting the role of acyltransferase
domains of modular polyketide synthases in the choice and stereochemical fate of
extender units," Biochemistry 38(5): 1643-1651, incorporated herein by reference.
Stereochemistry can also be changed by changing the KR domain. Also, one can alter the
specificity of an AT domain by changing only a small segment of the domain. See Lau et
al., supra. One can also take advantage of known linker regions in PKS proteins to link
modules from two different PKSs to create a hybrid PKS. See Gokhale et al., 16 Apr.
1999, "Dissecting and Exploiting Intermodular Communication in Polyketide Synthases,"
Science 284: 482-485, incorporated herein by reference.
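
The reason one adds or removes KR, DH, and ER functionalities is that they determine the state of the beta-carbon left by each round of chain extension. The following Python sketch records that relationship. The mapping reflects the generally accepted account of modular PKS chemistry rather than anything stated in the passage above, and the function name beta_carbon_state is hypothetical.

```python
def beta_carbon_state(reductive_domains):
    """Map the active reductive domains of a module to the resulting beta-carbon state."""
    states = {
        frozenset(): "keto",                            # no reduction
        frozenset({"KR"}): "hydroxyl",                  # ketoreduction only
        frozenset({"KR", "DH"}): "double bond",         # reduction, then dehydration
        frozenset({"KR", "DH", "ER"}): "methylene",     # full reduction
    }
    domains = frozenset(reductive_domains)
    if domains not in states:
        # In this simplified model a DH needs a KR, and an ER needs a KR and a DH.
        raise ValueError("unsupported combination of reductive domains")
    return states[domains]


# Replacing a KR with a KR, DH, and ER segment changes the outcome from hydroxyl to methylene.
print(beta_carbon_state({"KR"}))
print(beta_carbon_state({"KR", "DH", "ER"}))
```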

The following Table lists references describing illustrative PKS genes and 
corresponding enzymes that can be utilized in the construction of the recombinant PKSs 
and the corresponding DNA compounds that encode them of the invention. Also 
presented are various references describing tailoring enzymes and corresponding genes 
that can be employed in accordance with the methods of the present invention. 
Avermectin 

U.S. Pat. No. 5,252,474 to Merck. 

MacNeil et al., 1993, Industrial Microorganisms: Basic and Applied Molecular
Genetics, Baltz, Hegeman, & Skatrud, eds. (ASM), pp. 245-256, A Comparison of the
Genes Encoding the Polyketide Synthases for Avermectin, Erythromycin, and
Nemadectin.

MacNeil et al., 1992, Gene 115: 119-125, Complex Organization of the
Streptomyces avermitilis genes encoding the avermectin polyketide synthase.

Ikeda et al., Aug. 1999, Organization of the biosynthetic gene cluster for the
polyketide anthelmintic macrolide avermectin in Streptomyces avermitilis, Proc. Natl. 
Acad. Sci. USA 96: 9509-9514. 
Candicidin (FR008) 

Hu et al., 1994, Mol. Microbiol. 14: 163-172.
Epothilone

U.S. Pat. App. Serial No. 60/130,560, filed 22 April 1999.
Erythromycin 

PCT Pub. No. 93/13663 to Abbott. 

U.S. Pat. No. 5,824,513 to Abbott.

Donadio et al., 1991, Science 252: 675-9.

Cortes et al., 8 Nov. 1990, Nature 348: 176-8, An unusually large
multifunctional polypeptide in the erythromycin producing polyketide synthase of
Saccharopolyspora erythraea.

Glycosylation Enzymes 

PCT Pat. App. Pub. No. 97/23630 to Abbott. 
FK-506 

Motamedi et al., 1998, The biosynthetic gene cluster for the macrolactone ring of
the immunosuppressant FK-506, Eur. J. Biochem. 256: 528-534.

Motamedi et al., 1997, Structural organization of a multifunctional polyketide
synthase involved in the biosynthesis of the macrolide immunosuppressant FK-506, Eur.
J. Biochem. 244: 74-80.

Methyltransferase 

US 5,264,355, issued 23 Nov. 1993, Methylating enzyme from 
Streptomyces MA6858. 31-O-desmethyl-FK-506 methyltransferase. 

Motamedi et al., 1996, Characterization of methyltransferase and 
hydroxylase genes involved in the biosynthesis of the immunosuppressants FK-506 and 
FK-520, J. Bacteriol. 178: 5243-5248.
Streptomyces hygroscopicus 

U.S. patent application Serial No. 09/154,083, filed 16 Sep. 1998. 
Lovastatin

U.S. Pat. No. 5,744,350 to Merck. 
Narbomycin 

U.S. patent application Serial No. 60/107,093, filed 5 Nov. 1998, and Serial No. 
60/120,254, filed 16 Feb. 1999. 
Nemadectin 

MacNeil et al., 1993, supra.
Niddamycin

Kakavas et al., 1997, Identification and characterization of the niddamycin
polyketide synthase genes from Streptomyces caelestis, J. Bacteriol. 179: 7515-7522.
Oleandomycin

Swan et al., 1994, Characterisation of a Streptomyces antibioticus gene encoding
a type I polyketide synthase which has an unusual coding sequence, Mol. Gen. Genet.
242: 358-362.

U.S. patent application Serial No. 60/120,254, filed 16 Feb. 1999.

Olano et al., 1998, Analysis of a Streptomyces antibioticus chromosomal region
involved in oleandomycin biosynthesis, which encodes two glycosyltransferases
responsible for glycosylation of the macrolactone ring, Mol. Gen. Genet. 259(2): 299-308.

Picromycin 

PCT patent application US99/15047, filed 2 Jul. 1999.

Xue et al., 1998, Hydroxylation of macrolactones YC-17 and narbomycin is
mediated by the pikC-encoded cytochrome P450 in Streptomyces venezuelae, Chemistry
& Biology 5(11): 661-667.

Xue et al., Oct. 1998, A gene cluster for macrolide antibiotic biosynthesis in
Streptomyces venezuelae: Architecture of metabolic diversity, Proc. Natl. Acad. Sci.
USA 95: 12111-12116.
Platenolide 

EP Pat. App. Pub. No. 791,656 to Lilly. 
Rapamycin

Schwecke et al., Aug. 1995, The biosynthetic gene cluster for the
polyketide rapamycin, Proc. Natl. Acad. Sci. USA 92: 7839-7843.

Aparicio et al., 1996, Organization of the biosynthetic gene cluster for rapamycin
in Streptomyces hygroscopicus: analysis of the enzymatic domains in the modular
polyketide synthase, Gene 169: 9-16.
Rifamycin

August et al., 13 Feb. 1998, Biosynthesis of the ansamycin antibiotic rifamycin:
deductions from the molecular analysis of the rif biosynthetic gene cluster of
Amycolatopsis mediterranei S699, Chemistry & Biology 5(2): 69-79.
Sorangium PKS 

U.S. patent application Serial No. 09/144,085, filed 31 Aug. 1998. 
Soraphen 

U.S. Pat. No. 5,716,849 to Novartis. 

Schupp et al., 1995, J. Bacteriology 177: 3673-3679. A Sorangium cellulosum
(Myxobacterium) Gene Cluster for the Biosynthesis of the Macrolide Antibiotic
Soraphen A: Cloning, Characterization, and Homology to Polyketide Synthase Genes
from Actinomycetes.
Spiramycin 

U.S. Pat. No. 5,098,837 to Lilly. 

Activator Gene 

U.S. Pat. No. 5,514,544 to Lilly. 
Tylosin 

EP Pub. No. 791,655 to Lilly. 
U.S. Pat. No. 5,876,991 to Lilly. 

Kuhstoss et al., 1996, Gene 183: 231-6, Production of a novel polyketide through
the construction of a hybrid polyketide synthase.

Tailoring enzymes

Merson-Davies and Cundliffe, 1994, Mol. Microbiol. 13: 349-355. Analysis of
five tylosin biosynthetic genes from the tylBA region of the Streptomyces fradiae 
genome. 

As the above Table illustrates, there are a wide variety of polyketide synthase 
genes that serve as readily available sources of DNA and sequence information for use in 
constructing the hybrid PKS-encoding DNA compounds of the invention. Methods for 
constructing hybrid PKS-encoding DNA compounds are described without reference to 
the FK-520 PKS in PCT patent publication No. 98/51695; U.S. Patent Nos. 5,672,491 
and 5,712,146 and U.S. patent application Serial Nos. 09/073,538, filed 6 May 1998, and 
09/141,908, filed 28 Aug 1998, each of which is incorporated herein by reference. 

The hybrid PKS-encoding DNA compounds of the invention can be and often are 
hybrids of more than two PKS genes. Moreover, there are often two or more modules in 
the hybrid PKS in which all or part of the module is derived from a second (or third) 
PKS. Thus, as one illustrative example, the present invention provides a hybrid FK-520 
PKS that contains the naturally occurring loading module and FkbP as well as modules 
one, two, four, six, seven, eight, nine, and ten of the FK-520 PKS and further
contains hybrid or heterologous modules three and five. Hybrid or heterologous module
three contains an AT domain that is specific for methylmalonyl CoA and can be derived,
for example, from the erythromycin or rapamycin PKS genes. Hybrid or heterologous 
module five contains an AT domain that is specific for malonyl CoA and can be derived 
for example, from the picromycin or rapamycin PKS genes. 

While an important embodiment of the present invention relates to hybrid PKS 
enzymes and corresponding genes, the present invention also provides recombinant FK-520
PKS genes in which there is no second PKS gene sequence present but which differ
from the FK-520 PKS gene by one or more deletions. The deletions can encompass one
or more modules and/or can be limited to a partial deletion within one or more modules.
When a deletion encompasses an entire module, the resulting FK-520 derivative is at
least two carbons shorter than the compound from which it was derived. When a deletion is
within a module, the deletion typically encompasses a KR, DH, or ER domain, or both
DH and ER domains, or both KR and DH domains, or all three KR, DH, and ER
domains.
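
The "at least two carbons" figure follows from the fact that each extender module adds a two-carbon backbone unit, plus any side-chain carbons carried by its extender unit. The following Python sketch works through that arithmetic. It is a simplified illustration: the function name carbons_lost is hypothetical, and the side-chain counts are the generally accepted values for these extender units rather than figures taken from the text.

```python
# Side-chain carbons attached to the alpha carbon for each extender unit.
SIDE_CHAIN_CARBONS = {
    "malonyl CoA": 0,            # hydrogen at the alpha carbon
    "methylmalonyl CoA": 1,      # methyl
    "ethylmalonyl CoA": 2,       # ethyl
    "2-hydroxymalonyl CoA": 0,   # hydroxyl, no extra carbon
}

BACKBONE_CARBONS_PER_MODULE = 2


def carbons_lost(deleted_module_substrates):
    """Carbons removed from the product when whole extender modules are deleted."""
    return sum(
        BACKBONE_CARBONS_PER_MODULE + SIDE_CHAIN_CARBONS[substrate]
        for substrate in deleted_module_substrates
    )


# Deleting one malonyl-specific module removes exactly the two-carbon minimum.
print(carbons_lost(["malonyl CoA"]))
# Deleting a methylmalonyl-specific module removes the backbone unit and its methyl branch.
print(carbons_lost(["methylmalonyl CoA"]))
```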

To construct a hybrid PKS or FK-520 derivative PKS gene of the invention, one
can employ a technique, described in PCT Pub. No. 98/27203 and U.S. patent application
Serial No. 08/989,332, filed 11 Dec. 1997, each of which is incorporated herein by
reference, in which the large PKS gene is divided into two or more, typically three,
segments, and each segment is placed on a separate expression vector. In this manner,
each of the segments of the gene can be altered, and various altered segments can be
combined in a single host cell to provide a recombinant PKS gene of the invention. This
technique makes more efficient the construction of large libraries of recombinant PKS
genes, vectors for expressing those genes, and host cells comprising those vectors.
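
The efficiency gain of the multi-vector technique is combinatorial: the number of distinct recombinant PKS genes is the product of the number of variants prepared for each segment. A minimal Python sketch of that count follows. The segment names and variant labels are hypothetical placeholders and do not correspond to actual constructs.

```python
from itertools import product

# Hypothetical variants prepared for each of three gene segments,
# each segment being carried on its own expression vector.
segment_variants = {
    "segment 1": ["wild type", "variant A"],
    "segment 2": ["wild type", "variant B", "variant C"],
    "segment 3": ["wild type", "variant D"],
}

# Every way of choosing one variant per segment for a single host cell.
combinations = list(product(*segment_variants.values()))

print(len(combinations))   # 2 * 3 * 2 = 12
for combo in combinations[:3]:
    print(dict(zip(segment_variants, combo)))
```

In this example seven altered or unaltered segment vectors yield twelve host cell genotypes, whereas twelve separate full-length constructs would otherwise be required.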

Thus, in one important embodiment, the recombinant DNA compounds of the 
invention are expression vectors. As used herein, the term expression vector refers to any 
nucleic acid that can be introduced into a host cell or cell-free transcription and 
translation medium. An expression vector can be maintained stably or transiently in a
cell, whether as part of the chromosomal or other DNA in the cell or in any cellular
compartment, such as a replicating vector in the cytoplasm. An expression vector also
comprises a gene that serves to produce RNA that is translated into a polypeptide in the
cell or cell extract. Furthermore, expression vectors typically contain additional
functional elements, such as resistance-conferring genes to act as selectable markers.

The various components of an expression vector can vary widely, depending on 
the intended use of the vector. In particular, the components depend on the host cell(s) in 
which the vector will be used or is intended to function. Vector components for 
expression and maintenance of vectors in E. coli are widely known and commercially
available, as are vector components for other commonly used organisms, such as yeast
cells and Streptomyces cells. 

In a preferred embodiment, the expression vectors of the invention are used to 
construct recombinant Streptomyces host cells that express a recombinant PKS of the 
invention. Preferred Streptomyces host cell/vector combinations of the invention include 
S. coelicolor CH999 and S. lividans K4-114 host cells, which do not produce
actinorhodin, and expression vectors derived from the pRM1 and pRM5 vectors, as
described in U.S. Patent No. 5,830,750 and U.S. patent application Serial Nos. 
08/828,898, filed 31 Mar. 1997, and 09/181,833, filed 28 Oct. 1998, each of which is 
incorporated herein by reference. 

The present invention provides a wide variety of expression vectors for use in 
Streptomyces. For replicating vectors, the origin of replication can be, for example and 
without limitation, a low copy number vector, such as SCP2* (see Hopwood et al.,
Genetic Manipulation of Streptomyces: A Laboratory Manual (The John Innes
Foundation, Norwich, U.K., 1985); Lydiate et al., 1985, Gene 35: 223-235; and Kieser
and Melton, 1988, Gene 65: 83-91, each of which is incorporated herein by reference),
SLP1.2 (Thompson et al., 1982, Gene 20: 51-62, incorporated herein by reference), and
pSG5(ts) (Muth et al., 1989, Mol. Gen. Genet. 219: 341-348, and Bierman et al., 1992,
Gene 116: 43-49, each of which is incorporated herein by reference), or a high copy
number vector, such as pIJ101 and pJV1 (see Katz et al., 1983, J. Gen. Microbiol. 129:
2703-2714; Vara et al., 1989, J. Bacteriol. 171: 5872-5881; and Servin-Gonzalez, 1993,
Plasmid 30: 131-140, each of which is incorporated herein by reference). Generally,
however, high copy number vectors are not preferred for expression of genes contained
on large segments of DNA. For non-replicating and integrating vectors, it is useful to
include at least an E. coli origin of replication, such as from pUC, p1P, p1I, and pBR. For
phage based vectors, the phages phiC31 and KC515 can be employed (see Hopwood et
al., supra).

Typically, the expression vector will comprise one or more marker genes by 
which host cells containing the vector can be identified and/or selected. Useful antibiotic 
resistance conferring genes for use in Streptomyces host cells include the ermE (confers 
resistance to erythromycin and other macrolides and lincomycin), tsr (confers resistance 
to thiostrepton), aadA (confers resistance to spectinomycin and streptomycin), aacC4 
(confers resistance to apramycin, kanamycin, gentamicin, geneticin (G418), and 
neomycin), hyg (confers resistance to hygromycin), and vph (confers resistance to 
viomycin) resistance conferring genes. 
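
The marker genes and antibiotics listed above amount to a lookup table, which is convenient when deciding which antibiotic to use to select for a vector carrying a given marker. A minimal Python sketch follows. The table contents are taken from the paragraph above; the names RESISTANCE_MARKERS and markers_for are hypothetical.

```python
# Marker gene -> antibiotics to which it confers resistance, as listed in the text.
RESISTANCE_MARKERS = {
    "ermE": {"erythromycin", "lincomycin"},   # the text also lists other macrolides
    "tsr": {"thiostrepton"},
    "aadA": {"spectinomycin", "streptomycin"},
    "aacC4": {"apramycin", "kanamycin", "gentamicin", "geneticin (G418)", "neomycin"},
    "hyg": {"hygromycin"},
    "vph": {"viomycin"},
}


def markers_for(antibiotic):
    """Return the marker genes in the table that confer resistance to the antibiotic."""
    return sorted(
        gene for gene, antibiotics in RESISTANCE_MARKERS.items()
        if antibiotic in antibiotics
    )


print(markers_for("thiostrepton"))   # ['tsr']
print(markers_for("apramycin"))      # ['aacC4']
```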

The recombinant PKS gene on the vector will be under the control of a promoter,
typically with an attendant ribosome binding site sequence. The present invention 
provides the endogenous promoters of the FK-520 PKS and related biosynthetic genes in 
recombinant form, and these promoters are preferred for use in the native hosts and in 
heterologous hosts in which the promoters function. A preferred promoter of the 
invention is the fkbO gene promoter, comprised in a sequence of about 270 bp between 
the start of the open reading frames of the fkbO and fkbB genes. The fkbO promoter is
believed to be bi-directional in that it promotes transcription of the genes fkbO, fkbP, and
fkbA in one direction and fkbB, fkbC, and fkbL in the other. Thus, in one aspect, the
present invention provides a recombinant expression vector comprising the promoter of
the fkbO gene of an FK-520 producing organism positioned to transcribe a gene other
than fkbO. In a preferred embodiment the transcribed gene is an FK-520 PKS gene. In
another preferred embodiment, the transcribed gene is a gene that encodes a protein 
comprised in a hybrid PKS. 

Heterologous promoters can also be employed and are preferred for use in host 
cells in which the endogenous FK-520 PKS gene promoters do not function or function 
poorly. A preferred heterologous promoter is the actI promoter and its attendant activator
gene actII-ORF4, which is provided in the pRM1 and pRM5 expression vectors, supra.
This promoter is activated in the stationary phase of growth when secondary metabolites
are normally synthesized. Other useful Streptomyces promoters include without limitation
those from the ermE gene and the melC1 gene, which act constitutively, and the tipA
gene and the merA gene, which can be induced at any growth stage. In addition, the T7
RNA polymerase system has been transferred to Streptomyces and can be employed in
the vectors and host cells of the invention. In this system, the coding sequence for the T7
RNA polymerase is inserted into a neutral site of the chromosome or in a vector under the
control of the inducible merA promoter, and the gene of interest is placed under the
control of the T7 promoter. As noted above, one or more activator genes can also be
employed to enhance the activity of a promoter. Activator genes in addition to the
actII-ORF4 gene discussed above include dnrI, redD, and ptpA genes (see U.S. patent
application Serial No. 09/181,833, supra) to activate promoters under their control.

In addition to providing recombinant DNA compounds that encode the FK-520
PKS, the present invention also provides DNA compounds that encode the enzymes that
synthesize the ethylmalonyl CoA and 2-hydroxymalonyl CoA utilized in the synthesis of
FK-520. Thus, the present
invention also provides recombinant host cells that express the genes required for the 
biosynthesis of ethylmalonyl CoA and 2-hydroxymalonyl CoA. Figures 3 and 4 show the 
location of these genes on the cosmids of the invention and the biosynthetic pathway that 
produces ethylmalonyl CoA. 

For 2-hydroxymalonyl CoA biosynthesis, the fkbH, fkbI, fkbJ, and fkbK genes are
sufficient to confer this ability on Streptomyces host cells. For conversion of
2-hydroxymalonyl to 2-methoxymalonyl, the fkbG gene is also employed. While the
complete coding sequence for fkbH is provided on the cosmids of the invention, the
sequence for this gene provided herein may be missing a T residue, based on a
comparison made with a similar gene cloned from the ansamitocin gene cluster by Dr. H.
Floss. Where the sequence herein shows one T, there may be two, resulting in an
extension of the fkbH reading frame to encode the amino acid sequence:
MTIVKCLVWDLDNTLWRGTVLEto
dlawerlerlgvaeyfvlarigw^ksqsvreiatelnfapttiafiddqpa
evafhlpevrcypaeqaatllslpef^vswdsrrrrlmyqagfardqarea
ysgpdedflrsldlsmtiapageeelsrweltlrtsqmnatgvhysdadlrall
tdpahevlvvtmgdrjgphgavgiilleApstwhlkllatscrvvsfgagatil
nwltdqgaragahlvajdfrrtdrnrmm
agverlhlepsarpapttltltaadiapvtvsaa^}.
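
The effect of a single missing T on a reading frame can be shown with a toy example. The following Python sketch translates a short invented sequence with one T and with two T residues at the same position. The sequence is hypothetical and is not the fkbH sequence; the function translate is written for this illustration, and the standard genetic code is assumed.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate from the first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Toy sequence as read with a single T at position 7 (hypothetical, not fkbH).
one_t = "ATGGCATGACCGGATAA"
# The same sequence if a second T is actually present at that position.
two_t = one_t[:7] + "T" + one_t[7:]

print(translate(one_t))   # frame ends early at the TGA stop codon
print(translate(two_t))   # extra T shifts the frame past that stop
```

With one T the toy frame terminates after two residues; with two T residues the downstream codons are read in a different frame and the encoded polypeptide is longer, which is the kind of extension described for the fkbH reading frame.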

For ethylmalonyl CoA biosynthesis, one requires only a crotonyl CoA reductase,
which can be supplied by the host cell but can also be supplied by recombinant
expression of the fkbS gene of the present invention. To increase yield of ethylmalonyl
CoA, one can also express the fkbE and fkbU genes. While such production can
be achieved using only the recombinant genes above, one can also achieve such
production by placing into the recombinant host cell a large segment of the DNA
provided by the cosmids of the invention. Thus, for 2-hydroxymalonyl and
2-methoxymalonyl CoA biosynthesis, one can simply provide the cells with the segment of
DNA located on the left side of the FK-520 PKS genes shown in Figure 1. For
ethylmalonyl CoA biosynthesis, one can simply provide the cells with the segment of
DNA located on the right side of the FK-520 PKS genes shown in Figure 1 or,
alternatively, both the right and left segments of DNA.

The recombinant DNA expression vectors that encode these genes can be used to 
construct recombinant host cells that can make these important polyketide building 
blocks from cells that otherwise are unable to produce them. For example, Streptomyces 
coelicolor and Streptomyces lividans do not synthesize ethylmalonyl CoA or
2-hydroxymalonyl CoA. The invention provides methods and vectors for constructing
recombinant Streptomyces coelicolor and Streptomyces lividans that are able to 
synthesize either or both ethylmalonyl CoA and 2-hydroxymalonyl CoA. These host cells 
are thus able to make polyketides, those requiring these substrates, that cannot otherwise 
be made in such cells. 

In a preferred embodiment, the present invention provides recombinant 
Streptomyces host cells, such as S. coelicolor and S. lividans, that have been transformed 
with a recombinant vector of the invention that codes for the expression of the 
ethylmalonyl CoA biosynthetic genes. The resulting host cells produce ethylmalonyl 
CoA and so are preferred host cells for the production of polyketides produced by PKS 
enzymes that comprise one or more AT domains specific for ethylmalonyl CoA. 
Illustrative PKS enzymes of this type include the FK-520 PKS and a recombinant PKS in 
which one or more AT domains is specific for ethylmalonyl CoA. 

In a related embodiment, the present invention provides Streptomyces host cells in 
which one or more of the ethylmalonyl or 2-hydroxymalonyl biosynthetic genes have 
been deleted by homologous recombination or rendered inactive by mutation. For 
example, deletion or inactivation of the fkbG gene can prevent formation of the methoxyl 
groups at C-13 and C-15 of FK-520 (or, in the corresponding FK-506 producing cell,
FK-506), leading to the production of 13,15-didesmethoxy-13,15-dihydroxy-FK-520 (or, in
the corresponding FK-506 producing cell, 13,15-didesmethoxy-13,15-dihydroxy-FK-506).
If the fkbG gene product acts on 2-hydroxymalonyl and the resulting
2-methoxymalonyl substrate is required for incorporation by the PKS, the AT domains of
modules 7 and 8 may bind malonyl CoA and methylmalonyl CoA. Such incorporation
results in the production of a mixture of polyketides in which the methoxy groups at C-13
and C-15 of FK-520 (or FK-506) are replaced by either hydrogen or methyl.

This possibility of non-specific binding is suggested by the construction of a hybrid
PKS of the invention in which the AT domain of module 8 of the FK-520 PKS replaced 
the AT domain of module 6 of DEBS. The resulting PKS produced, in Streptomyces 
lividans, 6-dEB and 2-desmethyl-6-dEB, indicating that the AT domain of module 8 of 
the FK-520 PKS could bind malonyl CoA and methylmalonyl CoA substrates. Thus, one 
could possibly also prepare the 13,15-didesmethoxy-FK-520 and corresponding FK-506 
compounds of the invention by deleting or otherwise inactivating one or more or all of 
the genes required for 2-hydroxymalonyl CoA biosynthesis, i.e., the fkbH, fkbI, fkbJ, and
fkbK genes. In any event, the deletion or inactivation of one or more biosynthetic genes 
required for ethylmalonyl and/or 2-hydroxymalonyl production prevents the formation of 
polyketides requiring ethylmalonyl and/or 2-hydroxymalonyl for biosynthesis, and the 
resulting host cells are thus preferred for production of polyketides that do not require the 
same. 

The host cells of the invention can be grown and fermented under conditions 
known in the art for other purposes to produce the compounds of the invention. See, e.g., 
U.S. Patent Nos. 5,194,378; 5,116,756; and 5,494,820, incorporated herein by reference,
for suitable fermentation processes. The compounds of the invention can be isolated from 
the fermentation broths of these cultured cells and purified by standard procedures. 
Preferred compounds of the invention include the following compounds: 13-desmethoxy- 
FK-506; 13-desmethoxy-FK-520; 13,15-didesmethoxy-FK-506; 13,15-didesmethoxy- 
FK-520; 13-desmethoxy-18-hydroxy-FK-506; 13-desmethoxy-18-hydroxy-FK-520;
13,15-didesmethoxy-18-hydroxy-FK-506; and 13,15-didesmethoxy-18-hydroxy-FK-520.
These compounds can be further modified as described for tacrolimus and FK-520 in 
U.S. Patent Nos. 5,225,403; 5,189,042; 5,164,495; 5,068,323; 4,980,466; and 4,920,218, 
incorporated herein by reference. 

Other compounds of the invention are shown in Figure 8, Parts A and B. In Figure 
8, Part A, illustrative C-32-substituted compounds of the invention are shown in two
columns under the heading R. The substituted compounds are preferred for topical
administration and are applied to the dermis for treatment of conditions such as psoriasis.
In Figure 8, Part B, illustrative reaction schemes for making the compounds shown in
Figure 8, Part A, are provided. In the upper scheme in Figure 8, Part B, the C-32
substitution is a tetrazole moiety, illustrative of the groups shown in the left column
under R in Figure 8, Part A. In the lower scheme in Figure 8, Part B, the C-32
substitution is a disubstituted amino group, where R3 and R4 can be any group similar to
the illustrative groups shown attached to the amine in the right column under R in Figure
8, Part A. While Figure 8 shows the C-32-substituted compounds in which the C-15
methoxy is present, the invention includes these C-32-substituted compounds in which
C-15 is ethyl, methyl, or hydrogen. Also, while C-21 is shown as substituted with ethyl or
allyl, the compounds of the invention include the C-32-substituted compounds in which
C-21 is substituted with hydrogen or methyl.

To make these C-32-substituted compounds, Figure 8, Part B, provides illustrative 
reaction schemes. Thus, a selective reaction of the starting compound (see Figure 8, Part 
B, for an illustrative starting compound) with trifluoromethanesulfonic anhydride in the 
presence of a base yields the C-32 O-triflate derivative, as shown in the upper scheme of 
Figure 8, Part B. Displacement of the triflate with 1H-tetrazole or triazole derivatives
provides the C-32 tetrazole or triazole derivative. As shown in the lower scheme of
Figure 8, Part B, reacting the starting compound with p-nitrophenylchloroformate yields
the corresponding carbonate, which, upon displacement with an amino compound,
provides the corresponding carbamate derivative. 

The compounds can be readily formulated to provide the pharmaceutical 
compositions of the invention. The pharmaceutical compositions of the invention can be 
used in the form of a pharmaceutical preparation, for example, in solid, semisolid, or 
liquid form. This preparation contains one or more of the compounds of the invention as 
an active ingredient in admixture with an organic or inorganic carrier or excipient 
suitable for external, enteral, or parenteral application. The active ingredient may be 
compounded, for example, with the usual non-toxic, pharmaceutically acceptable carriers
for tablets, pellets, capsules, suppositories, solutions, emulsions, suspensions, and any
other form suitable for use. Suitable formulation processes and compositions for the
compounds of the present invention are described with respect to tacrolimus in U.S. 
Patent Nos. 5,939,427; 5,922,729; 5,385,907; 5,338,684; and 5,260,301, incorporated 
herein by reference. Many of the compounds of the invention contain one or more chiral 
centers, and all of the stereoisomers are included within the scope of the invention, as 
pure compounds as well as mixtures of stereoisomers. Thus the compounds of the 
invention may be supplied as a mixture of stereoisomers in any proportion. 

The carriers which can be used include water, glucose, lactose, gum acacia, 
gelatin, mannitol, starch paste, magnesium trisilicate, talc, corn starch, keratin, colloidal 
silica, potato starch, urea, and other carriers suitable for use in manufacturing 
preparations, in solid, semi-solid, or liquified form. In addition, auxiliary stabilizing, 
thickening, and coloring agents and perfumes may be used. For example, the compounds 
of the invention may be utilized with hydroxypropyl methylcellulose essentially as 
described in U.S. Patent No. 4,916,138, incorporated herein by reference, or with a 
surfactant essentially as described in EPO patent publication No. 428,169, incorporated 
herein by reference. 

Oral dosage forms may be prepared essentially as described by Hondo et al.,
1987, Transplantation Proceedings XIX, Supp. 6: 17-22, incorporated herein by
reference. Dosage forms for external application may be prepared essentially as described 
in EPO patent publication No. 423,714, incorporated herein by reference. The active 
compound is included in the pharmaceutical composition in an amount sufficient to 
produce the desired effect upon the disease process or condition. 

For the treatment of conditions and diseases relating to immunosuppression or 
neuronal damage, a compound of the invention may be administered orally, topically, 
parenterally, by inhalation spray, or rectally in dosage unit formulations containing 
conventional non-toxic pharmaceutically acceptable carriers, adjuvants, and vehicles. The
term parenteral, as used herein, includes subcutaneous injections, and intravenous, 
intramuscular, and intrasternal injection or infusion techniques. 

Dosage levels of the compounds of the present invention are of the order from
about 0.01 mg to about 50 mg per kilogram of body weight per day, preferably from
about 0.1 mg to about 10 mg per kilogram of body weight per day. The dosage levels are
useful in the treatment of the above-indicated conditions (from about 0.7 mg to about 3.5
g per patient per day, assuming a 70 kg patient). In addition, the compounds of the
present invention may be administered on an intermittent basis, i.e., at semi-weekly,
weekly, semi-monthly, or monthly intervals.
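
By way of illustration only, the per-kilogram dosage levels stated above can be scaled to a whole-patient daily amount with simple arithmetic. The following Python sketch is not part of any dosing method of the invention; the function name and the 70 kg reference weight constant are hypothetical helpers introduced here for the calculation.

```python
def daily_dose_range_mg(low_mg_per_kg, high_mg_per_kg, body_weight_kg):
    """Scale a per-kilogram daily dose range to a whole-patient daily range in mg."""
    return (low_mg_per_kg * body_weight_kg, high_mg_per_kg * body_weight_kg)


PATIENT_KG = 70.0  # reference body weight used in the text

# Broad range (0.01 to 50 mg/kg/day) and preferred range (0.1 to 10 mg/kg/day).
broad_range = daily_dose_range_mg(0.01, 50.0, PATIENT_KG)
preferred_range = daily_dose_range_mg(0.1, 10.0, PATIENT_KG)

for label, (low, high) in (("broad", broad_range), ("preferred", preferred_range)):
    print(f"{label}: {low:.1f} mg to {high:.1f} mg per patient per day")
```

For a 70 kg patient the broad range works out to roughly 0.7 mg to 3500 mg (3.5 g) per day, and the preferred range to roughly 7 mg to 700 mg per day.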

The amount of active ingredient that may be combined with the carrier materials 
to produce a single dosage form will vary depending upon the host treated and the 
particular mode of administration. For example, a formulation intended for oral 
administration to humans may contain from 0.5 mg to 5 g of active agent compounded 
with an appropriate and convenient amount of carrier material, which may vary from 
about 5 percent to about 95 percent of the total composition. Dosage unit forms will 
generally contain from about 0.5 mg to about 500 mg of active ingredient. For external 
administration, the compounds of the invention can be formulated within the range of, for 
example, 0.00001% to 60% by weight, preferably from 0.001% to 10% by weight, and 
most preferably from about 0.005% to 0.8% by weight. The compounds and 
compositions of the invention are useful in treating disease conditions using doses and 
administration schedules as described for tacrolimus in U.S. Patent Nos. 5,542,436; 
5,365,948; 5,348,966; and 5,196,437, incorporated herein by reference. The compounds 
of the invention can be used as single therapeutic agents or in combination with other 
therapeutic agents. Drugs that can be usefully combined with compounds of the invention 
include one or more immunosuppressant agents such as rapamycin, cyclosporin A, FK- 
506, or one or more neurotrophic agents. 

It will be understood, however, that the specific dosage level for any particular 
patient will depend on a variety of factors. These factors include the activity of the 
specific compound employed; the age, body weight, general health, sex, and diet of the 
subject; the time and route of administration and the rate of excretion of the drug; 
whether a drug combination is employed in the treatment; and the severity of the 
particular disease or condition for which therapy is sought.

A detailed description of the invention having been provided above, the following
examples are given for the purpose of illustrating the present invention and shall not be 
construed as being a limitation on the scope of the invention or claims. 



Replacement of Methoxyl with Hydrogen or Methyl at C-13 of FK-520 
The C-13 methoxyl group is introduced into FK-520 via an AT domain in 
extender module 8 of the PKS that is specific for hydroxymalonyl and by methylation of 
the hydroxyl group by an S-adenosyl methionine (SAM) dependent methyltransferase. 
Metabolism of FK-506 and FK-520 primarily involves oxidation at the C-13 position into
an inactive derivative that is further degraded by host P450 and other enzymes. The
present invention provides compounds related in structure to FK-506 and FK-520 that do
not contain the C-13 methoxy group and exhibit greater stability and a longer half-life in
vivo. These compounds are useful medicaments due to their immunosuppressive and
neurotrophic activities, and the invention provides the compounds in purified form and as
pharmaceutical compositions. 

The present invention also provides the novel PKS enzymes that produce these 
novel compounds as well as the expression vectors and host cells that produce the novel 
PKS enzymes. The novel PKS enzymes include, among others, those that contain an AT 
domain specific for either malonyl CoA or methylmalonyl CoA in module 8 of the
FK-506 and FK-520 PKS. This example describes the construction of recombinant DNA
compounds that encode the novel FK-520 PKS enzymes and the transformation of host
cells with those recombinant DNA compounds to produce the novel PKS enzymes and
the polyketides produced thereby.



To construct an expression cassette for performing module 8 AT domain
replacements in the FK-520 PKS, a 4.6 kb SphI fragment from the FK-520 gene cluster
was cloned into plasmid pLitmus 38 (a cloning vector available from New England
Biolabs). The 4.6 kb SphI fragment, which encodes the ACP domain of module 7
followed by module 8 through the KR domain, was isolated from an agarose gel after
digesting the cosmid pKOS65-C31 with SphI. The clone having the insert oriented so
the single SacI site was nearest to the SpeI end of the polylinker was identified and
designated as plasmid pKOS60-21-67. To generate appropriate cloning sites, two linkers
were ligated sequentially as follows. First, a linker was ligated between the SpeI and
SacI sites to introduce a BglII site at the 5' end of the cassette, to eliminate interfering
polylinker sites, and to reduce the total insert size to 4.5 kb (the limit of the phage
KC515). The ligation reactions contained 5 picomolar unphosphorylated linker DNA and
0.1 picomolar vector DNA, i.e., a 50-fold molar excess of linker to vector. The linker had
the following sequence:

5'-CTAGTGGGCAGATCTGGCAGCT-3'
    3'-ACCCGTCTAGACCG-5'

The resulting plasmid was designated pKOS60-27-1.
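
By way of illustration only, the pairing of the two linker strands given above can be checked computationally. The following Python sketch assumes the strands read as CTAGTGGGCAGATCTGGCAGCT (5' to 3') and ACCCGTCTAGACCG (printed 3' to 5'); the helper name reverse_complement and the variable names are hypothetical. It locates the double-stranded region, reports the single-stranded ends, and confirms that the BglII recognition sequence AGATCT lies in the duplex.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


top = "CTAGTGGGCAGATCTGGCAGCT"      # upper strand, 5'->3'
bottom_3_to_5 = "ACCCGTCTAGACCG"    # lower strand as printed, 3'->5'
bottom = bottom_3_to_5[::-1]        # lower strand rewritten 5'->3'

# The stretch of the upper strand that pairs with the lower strand.
duplex = reverse_complement(bottom)
start = top.find(duplex)            # -1 would mean the strands do not pair
left_overhang = top[:start]
right_overhang = top[start + len(duplex):]

print("duplex region :", duplex)
print("left overhang :", left_overhang)
print("right overhang:", right_overhang)
print("BglII site in duplex:", "AGATCT" in duplex)
```

The single-stranded ends found this way, CTAG on the left and AGCT on the right, match the cohesive ends left by SpeI and SacI digestion, respectively, which is consistent with ligation between those two sites.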

Next, a linker of the following sequence was ligated between the unique SphI and
AflII sites of plasmid pKOS60-27-1 to introduce an NsiI site at the 3' end of the module 8
cassette. The linker employed was:

    5'-GGGATGCATGGC-3'
3'-GTACCCCTACGTACCGAATT-5'

The resulting plasmid was designated pKOS60-29-55.

To allow in-frame insertions of alternative AT domains, sites were engineered at
the 5' end (AvrII or NheI) and 3' end (XhoI) of the AT domain using the polymerase
chain reaction (PCR) as follows. Plasmid pKOS60-29-55 was used as a template for the
PCR and sequence 5' to the AT domain was amplified with the primers SpeBgl-fwd and
either Avr-rev or Nhe-rev:

SpeBgl-fwd  5'-CGACTCACTAGTGGGCAGATCTGG-3'
Avr-rev     5'-CACGCCTAGGCCGGTCGGTCTCGGGCCAC-3'
Nhe-rev     5'-GCGGCTAGCTGCTCGCCCATCGCGGGATGC-3'

The PCR included, in a 50 ul reaction, 5 ul of 10x Pfu polymerase buffer
(Stratagene), 5 ul 10x z-dNTP mixture (2 mM dATP, 2 mM dCTP, 2 mM dTTP, 1 mM
dGTP, 1 mM 7-deaza-GTP), 5 ul DMSO, 2 ul of each primer (10 uM), 1 ul of template
DNA (0.1 ug/ul), and 1 ul of cloned Pfu polymerase (Stratagene). The PCR conditions
were 95°C for 2 min., 25 cycles at 95°C for 30 sec, 60°C for 30 sec, and 72°C for 4
min., followed by 4 min. at 72°C and a hold at 0°C. The amplified DNA products and
the Litmus vectors were cut with the appropriate restriction enzymes (BglII and AvrII or
SpeI and NheI), and cloned into either pLitmus 28 or pLitmus 38 (New England Biolabs),
respectively, to generate the constructs designated pKOS60-37-4 and pKOS60-37-2,
respectively.
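
By way of illustration only, the composition of the 50 ul PCR described above can be tabulated and the final concentrations computed. The following Python sketch assumes that the balance of the reaction volume is water, which the text does not state; the dictionary keys and the helper final_concentration are hypothetical names.

```python
TOTAL_UL = 50.0

# Volumes stated in the text, in microliters.
components_ul = {
    "10x Pfu polymerase buffer": 5.0,
    "10x z-dNTP mixture": 5.0,
    "DMSO": 5.0,
    "forward primer (10 uM)": 2.0,
    "reverse primer (10 uM)": 2.0,
    "template DNA (0.1 ug/ul)": 1.0,
    "cloned Pfu polymerase": 1.0,
}


def final_concentration(stock, volume_ul, total_ul=TOTAL_UL):
    """Dilute a stock concentration into the total volume (units carry through)."""
    return stock * volume_ul / total_ul


# Assumption: whatever volume remains is made up with water.
water_ul = TOTAL_UL - sum(components_ul.values())

primer_final_um = final_concentration(10.0, 2.0)   # each primer, in uM
datp_final_mm = final_concentration(2.0, 5.0)      # dATP from the 2 mM stock, in mM
dmso_percent = 100.0 * components_ul["DMSO"] / TOTAL_UL
template_ug = 0.1 * components_ul["template DNA (0.1 ug/ul)"]

print(f"water to add: {water_ul:.1f} ul")
print(f"each primer: {primer_final_um:.2f} uM; dATP: {datp_final_mm:.2f} mM")
print(f"DMSO: {dmso_percent:.0f}% v/v; template: {template_ug:.2f} ug")
```

Under that assumption the listed components total 21 ul, leaving 29 ul of water, with each primer at 0.4 uM and DMSO at 10% by volume.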

Plasmid pKOS60-29-55 was again used as a template for PCR to amplify
sequence 3' to the AT domain using the primers BsrXho-fwd and NsiAfl-rev:

BsrXho-fwd  5'-GATGTACAGCTCGAGTCGGCACGCCCGGCCGCATC-3'
NsiAfl-rev  5'-CGACTCACTTAAGCCATGCATCC-3'

PCR conditions were as described above. The PCR fragment was cut with BsrGI
and AflII, gel isolated, and ligated into pKOS60-37-4 cut with Asp718 and AflII and
inserted into pKOS60-37-2 cut with BsrGI and AflII, to give the plasmids pKOS60-39-1
and pKOS60-39-13, respectively. These two plasmids can be digested with AvrII and
XhoI or NheI and XhoI, respectively, to insert heterologous AT domains specific for
malonyl, methylmalonyl, ethylmalonyl, or other extender units.
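
By way of illustration only, the restriction sites engineered into the PCR primers can be located by a simple string search. The following Python sketch uses the standard recognition sequences of the enzymes named in this example and two of the primers given above; the function name find_sites and the dictionary layout are hypothetical.

```python
# Recognition sequences (5'->3') of the enzymes named in this example.
SITES = {
    "SpeI": "ACTAGT",
    "BglII": "AGATCT",
    "AvrII": "CCTAGG",
    "NheI": "GCTAGC",
    "BsrGI": "TGTACA",
    "XhoI": "CTCGAG",
    "NsiI": "ATGCAT",
    "AflII": "CTTAAG",
}


def find_sites(seq):
    """Map each enzyme whose site occurs in seq to the 0-based index of its first hit."""
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


# All of these sites are palindromic, so scanning one strand is sufficient.
primers = {
    "BsrXho-fwd": "GATGTACAGCTCGAGTCGGCACGCCCGGCCGCATC",
    "Avr-rev": "CACGCCTAGGCCGGTCGGTCTCGGGCCAC",
}

for name, seq in primers.items():
    print(name, find_sites(seq))
```

The search finds BsrGI and XhoI sites near the 5' end of BsrXho-fwd and an AvrII site near the 5' end of Avr-rev, in keeping with the primer names.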

Malonyl- and methylmalonyl-specific AT domains were cloned from the
rapamycin cluster using PCR amplification with a pair of primers that introduce an AvrII
or NheI site at the 5' end and an XhoI site at the 3' end. The PCR conditions were as
given above and the primer sequences were as follows:

RATN1    5'-ATCCTAGGCGGGCRGGYGTGTCGTCCTTCGG-3'
(3' end of Rap KS sequence and universal for malonyl and methylmalonyl CoA),
RATMN2   5'-ATGCTAGCCGCCGCGTTCCCCGTCTTCGCGCG-3'
(Rap AT shorter version 5'- sequence and specific for malonyl CoA),
RATMMN2  5'-ATGCTAGCGGATTCGTCGGTGGTGTTCGCCGA-3'
(Rap AT shorter version 5'- sequence and specific for methylmalonyl CoA), and
RATC     5'-ATCTCGAGCCAGTASGGCTGGTGYTGGAAGG-3'
(Rap DH 5'- sequence and universal for malonyl and methylmalonyl CoA).

[Figure: primer positions on any Rap module. Primers N1 (AvrII), MN2 (NheI), and MMN2
(NheI) lie on the 5' side of the AT domain; primer C (XhoI) lies on the 3' side.]

Because of the high sequence similarity in each module of the rapamycin cluster, 
each primer was expected to prime any of the AT domains. PCR products representing 
ATs specific for malonyl or methylmalonyl extenders were identified by sequencing 
individual cloned PCR products. Sequencing also confirmed that the chosen clones 
contained no cloning artifacts. Examples of hybrid modules with the rapamycin AT12
and AT13 domains are shown in a separate figure.
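
By way of illustration only, the "universal" primers contain degenerate positions, so each is in practice a mixture of oligonucleotides. The following Python sketch expands the IUPAC codes R (A or G), Y (C or T), and S (C or G) into the individual sequences. The primer strings are as read here from the primer list of this example, and their exact transcription is an assumption; the function name expand_degenerate is hypothetical.

```python
from itertools import product

# IUPAC nucleotide codes used in the primers of this example.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",  # purine
    "Y": "CT",  # pyrimidine
    "S": "CG",  # strong pair
}


def expand_degenerate(primer):
    """Return every non-degenerate sequence represented by a degenerate primer."""
    choices = [IUPAC[base] for base in primer]
    return ["".join(combo) for combo in product(*choices)]


# Assumed transcriptions of the two universal primers.
RATN1 = "ATCCTAGGCGGGCRGGYGTGTCGTCCTTCGG"
RATC = "ATCTCGAGCCAGTASGGCTGGTGYTGGAAGG"

for name, primer in (("RATN1", RATN1), ("RATC", RATC)):
    variants = expand_degenerate(primer)
    print(name, "represents", len(variants), "sequences")
    for v in variants:
        print("  ", v)
```

Each primer carries two two-fold degenerate positions and therefore represents four sequences, which is how a single primer can anneal to the slightly different AT-flanking sequences of several rapamycin modules.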

The AvrII-XhoI restriction fragment that encodes module 8 of the FK-520 PKS
with the endogenous AT domain replaced by the AT domain of module 12 of the
rapamycin PKS has the DNA sequence and encodes the amino acid sequence shown
below. The AT of rap module 12 is specific for incorporation of malonyl units.

AGATCTGGCAGCTCGCCGAAGCGCTGCTGACGCTCGTCCGGGAGAGCACC 50

IWQLAEALLTLVREST
GCCGCCGTGCTCGGCCACGTGGGTGGCGAGGACATCCCCGCGACGGCGGC 100 

AAVLGHVGGEDI PATAA 
GTTCAAGGACCTCGGCATCGACTCGCTCACCGCGGTCCAGCTGCGCAACG 150 

FKDLGI DSLTAVQLRN 
CCCTCACCGAGGCGACCGGTGTGCGGCTGAACGCCACGGCGGTCTTCGAC 200 
ALTEATGVRLNATAVFD 
TTCCCGACCCCGCACGTGCTCGCCGGGAAGCTCGGCGACGAACTGACCGG 250 

FPTPHVLAGKLGDELTG 
CACCCGCGCGCCCGTCGTGCCCCGGACCGCGGCC ACGGCCGGTGCGCACG 300 

TRAPVVPRTAATAGAH 
ACGAGCCGCTGGCGATCGTGGGAATGGCCTGCCGGCTGCCCGGCGGGGTC 350 
DEPLAIVGMACRLPGGV 
GCGTCACCCGAGGAGCTGTGGCACCTCGTGGCATCCGGCACCGACGCCAT 400 

ASPEELWHLVASGTDAI 
CACGGAGTTCCCGACGGACCGCGGCTGGGACGTCGACGCGATCTACGACC 450 

TEFPTDRGWDVDAI YD 
CGGACCCCGACGCGATCGGCAAGACCTTCGTCCGGCACGGTGGCTTCCTC 500 
PDPDAIGKTFVRHGGFL 
ACCGGCGCGACAGGCTTCGACGCGGCGTTCTTCGGCATCAGCCCGCGCGA 550 

T G A T G F D A A F F G I S P R E 
GGCCCTCGCGATGGACCCGCAGCAGCGGGTGCTCCTGGAGACGTCGTGGG 600 

ALAMDPQQRVLLETSW 
AGGCGTTCGAAAGCGCCGGCATCACCCCGGACTCGACCCGCGGCAGCGAC 650
EAFESAGITPDSTRGSD
ACCGGCGTGTTCGTCGGCGCCTTCTCCTACGGTTACGGCACCGGTGCGGA 700 

TGVFVGAFSYGYGTGAD 
CACCGACGGCTTCGGCGCGACCGGCTCGCAGACCAGTGTGCTCTCCGGCC 7 50 
5 T DG FGA T GSQT SV L S G 

GGCTGTCGTACTTCTACGGTCTGGAGGGTCCGGCGGTCACGGTCGACACG 800 
RLSYFYGLEGPAVTVDT 
GCGTGTTCGTCGTCGCTGGTGGCGCTGCACCAGGCCGGGCAGTCGCTGCG 850 
AC S SSLVALHQAGQS LR 
10 CTCCGGCGAATGCTCGCTCGCCCTGGTCGGCGGCGTCACGGTGATGGCGT 900 
SGECSLALVGGVTVM A 
CTCCCGGCGGCTTCGTGGAGTTCTCCCGGCAGCGCGGCCTCGCGCCGGAC 950 
SPGGFVEFSRQRGLAPD 
GGCCGGGCGAAGGCGTTCGGCGCGGGTGCGGACGGCACGAGCTTCGCCGA 1000 
15 GRAKAFGAGADGTS FAE 

GGGTGCCGGTGTGCTGATCGTCGAGAGGCTCTCCGACGCCGAACGCAACG 1050 

GAGVLI VERLS DAE RN 
GTCACACCGTCCTGGCGGTCGTCCGTGGTTCGGCGGTCAACCAGGATGGT 1100 
GHTVLAVVRGSAVNQDG 
20 GCCTCCAACGGGCTGTCGGCGCCGAACGGGCCGTCGC AGGAGCGGGTGAT 1150 
ASNGLSAPNGPSQERVI 
CCGGCAGGCCCTGGCCAACGCCGGGCTCACCCCGGCGGACGTGGACGCCG 1200 

RQALANAGLTPADV DA 
TCGAGGCCCACGGCACCGGCACCAGGCTGGGCGACCCCATCGAGGCACAG 1250 
25 VE AHGTG TR L.GDPIEAQ 

GCGGTACTGGCCACCTACGGACAGGAGCGCGCCACCCCCCTGCTGCTGGG 1300 

AVLATYGQERATPLLLG 
CTCGCTGAAGTCCAACATCGGCCACGCCCAGGCCGCGTCCGGCGTCGCCG 1350 
SLKSNIGHAQAASGVA 
30 GCATCATCAAGATGGTGCAGGCCCTCCGGCACGGGGAGCTGCCGCCGACG 14 00 
GI I KMV-QALRHGELPPT 
CTGC ACGCCG ACGAGCCGTCGCCGCACGTCGACTGGACGGCCGGCGCCGT 14 50 

LHADEPS PHVDWTAGAV 
CGAACTGCTGACGTCGGCCCGGCCGTGGCCCGAGACCGACCGGCCTAGGC 1500 
35 ELLTSARPWPETDRPR 

GGGCAGGCGTGTCGTCCTTCGGGATCAGTGGCACCAACGCCCACGTCATC 1550 
RAGVSS FGISGTNAHVI 
CTGGAAAGCGCACCCCCCACTCAGCCTGCGGACAACGCGGTGATCGAGCG 1600 
LESAPPTQPADNAVIER 
40 GGCACCGGAGTGGGTGCCGTTGGTGATTTCGGCCAGGACCCAGTCGGCTT 1650 
APEWVPLVISARTQS A 
TGACTGAGCACGAGGGCCGGTTGCGTGCGTATCTGGCGGCGTCGCCCGGG 1700 
LTEHEGRLRAYLAAS PG 
GTGGATATGCGGGCTGTGGCATCGACGCTGGCGATGACACGGTCGGTGTT 1750 
45 VDMRAVASTLAMTRSVF 

CGAGCACCGTGCCGTGCTGCTGGGAGATGACACCGTCACCGGCACCGCTG 1800 

EHRAVLLGDDTVTGTA 
TGTCTGACCCTCGGGCGGTGTTCGTCTTCCCGGGACAGGGGTCGCAGCGT 1850 
VSDPRAVFVFPGQGSQR 
50 GCTGGCATGGGTGAGGAACTGGCCGCCGCGTTCCCCGTCTTCGCGCGGAT 1900 
AGMGEELAAAFPVFARI 
CCATCAGCAGGTGTGGGACCTGCTCGATGTGCCCGATCTGGAGGTGAACG 1950 

HQQVWDLLDVPDLEVN 
AGACCGGTTACGCCCAGCCGGCCCTGTTCGCAATGCAGGTGGCTCTGTTC 2000
ETGYAQPALFAMQVALF
GGGCTGCTGGAATCGTGGGGTGTACGACCGGACGCGGTGATCGGCCATTC 2050 

GLLESW GVRPDAVIGHS 
GGTGGGTGAGCTTGCGGCTGCGTATGTGTCCGGGGTGTGGTCGTTGGAGG 2100 
5 VGELAAAYVSGVWSLE 

ATGCCTGCACTTTGGTGTCGGCGCGGGCTCGTCTGATGCAGGCTCTGCCC 2150 
DACTLVSARARLMQALP 
GCGGGTGGGGTGATGGTCGCTGTCCCGGTCTCGGAGGATGAGGCCCGGGC 2200 
AGGVMVAVPVSEDEARA 
1 0 CGTGCTGGGTGAGGGTGTGGAGATCGCCGCGGTCAACGGCCCGTCGTCGG 2250 
VLGEGVEIAAVNGPSS 
TGGTTCTCTCCGGTGATGAGGCCGCCGTGCTGCAGGCCGCGGAGGGGCTG 2300 
VVLSGDEAAVLQAAEGL 
GGGAAGTGGACGCGGCTGGCGACCAGCCACGCGTTCCATTCCGCCCGTAT 2350 
15 GKWTRLATSHAFHSARM 

GGAACCCATGCTGGAGGAGTTCCGGGCGGTCGCCGAAGGCCTGACCTACC 24 00 

EPMLEEFRAVAEGLTY 
GGACGCCGCAGGTCTCCATGGCCGTTGGTGATCAGGTGACCACCGCTGAG 24 50 
RTPQVSMAVGDQVTTAE 
20 TACTGGGTGCGGCAGGTCCGGGACACGGTCCGGTTCGGCGAGCAGGTGGC 2500 
YWVRQVR DTVRFGEQVA 
CTCGTACGAGGACGCCGTGTTCGTCGAGCTGGGTGCCGACCGGTCACTGG 2550 

SYEDAVFVELGADRSL 
CCCGCCTGGTCGACGGTGTCGCGATGCTGCACGGCGACCACGAAATCCAG 2 600 
25 A RLVDGV AMLH GD HEIQ 

GCCGCGATCGGCGCCCTGGCCCACCTGTATGTCAACGGCGTCACGGTCGA 2 650 

AAIGALAHLYVNGVTVD 
CTGGCCCGCGCTCCTGGGCGATGCTCCGGCAACACGGGTGCTGGACCTTC 2700 
WPALLGDAPATRVLDL 
30 CGACATACGCCTTCCAGCACCAGCGCTACTGGCTCGAGTCGGCACGCCCG 27 50 
PTYAFQHQRYWLESARP 
GCCGCATCCGACGCGGGCCACCCCGTGCTGGGCTCCGGTATCGCCCTCGC 2800 

AAS DAG H PVLGS'GIALA 
CGGGTCGCCGGGCCGGGTGTTCACGGGTTCCGTGCCGACCGGTGCGGACC 2850 
35 GSPGRVFTGSVPTGAD 

GCGCGGTGTTCGTCGCCGAGCTGGCGCTGGCCGCCGCGGACGCGGTCGAC 2900 
RAVFVAELALAAADAVD 
TGCGCCACGGTCGAGCGGCTCGACATCGCCTCCGTGCCCGGCCGGCCGGG 2 950 
CATVERLDIASVPGRPG 
40 CCATGGCCGGACGACCGTACAGACCTGGGTCGACGAGCCGGCGGACGACG 3000 
HGRTTVQTWVD EPADD 
GCCGGCGCCGGTTCACCGTGCACACCCGC ACCGGCGACGCCCCGTGGACG 3050 
GRRRFTVHTRTGDAPWT 
CTGCACGCCGAGGGGGTGCTGCGCCCCCATGGCACGGCCCTGCCCGATGC 3100 
45 LHAEGVLRPHGTALPDA 

GGCCGACGCCGAGTGGCCCCCACCGGGCGCGGTGCCCGCGGACGGGCTGC 3150 

ADAEWPPPGAVP ADGL 
CGGGTGTGTGGCGCCGGGGGGACCAGGTCTTCGCCGAGGCCGAGGTGGAC 3200 
PGVWRRG DQVFAEAEVD- 
50 GGACCGGACGGTTTCGTGGTGCACCCCGACCTGCTCGACGCGGTCTTCTC 3250 
GPDGFVVHPDLLDAVFS 
CGCGGTCGGCGACGGAAGCCGCCAGCCGGCCGGATGGCGCGACCTGACGG 3300 

AVGDGSRQPAGWRDLT 
TGCACGCGTCGGACGCCACCGTACTGCGCGCCTGCCTCACCCGGCGCACC 3350
VHASDATVLRACLTRRT
GACGGAGCCATGGGATTCGCCGCCTTCGACGGCGCCGGCCTGCCGGTACT 3400 

DGAMGFAA FD. GAGLPVL 
CACCGCGGAGGCGGTGACGCTGCGGGAGGTGGCGTCACCGTCCGGCTCCG 3450 
5 TAEAVTLREVAS PSGS 

AGGAGTCGGACGGCCTGCACCGGTTGGAGTGGCTCGCGGTCGCCGAGGCG 3500 
EES DGLHRLEWLAVAEA 
GTCTACGACGGTGACCTGCCCGAGGGACATGTCCTGATCACCGCCGCCCA 3550 
VYDGDLPEGHVLITAAH 
10 CCCCGACGACCCCGAGGACATACCCACCCGCGCCCACACCCGCGCCACCC 3600 
PDDPEDI PTRAHTRAT 
GCGTCCTGACCGCCCTGCAACACCACCTCACCACCACCGACCACACCCTC 3650 
RVLTALQHHLTTTDHTL 
ATCGTCCACACCACCACCGACCCCGCCGGCGCCACCGTCACCGGCCTCAC 37 00 
15 IVHTTTDPAGATVTGLT 

CCGCACCGCCCAGAACGAACACCCCCACCGCATCCGCCTCATCGAAACCG 3750 

RTAQNEHPHRIRLIET 
ACCACCCCCACACCCCCCTCCCCCTGGCCCAACTCGCCACCCTCGACCAC 3800 
DHPHTPL PLA QLATLDH 
20 CCCCACCTCCGCCTCACCCACCACACCCTCCACCACCCCCACCTCACCCC 3850 
PHLRLTHHTLHHPHLTP 
CCTCCACACCACCACCCCACCCACCACCACCCCCCTCAACCCCGAACACG 3900
LHTTTPPTTTPLNPEH
CCATCATCATCACCGGCGGCTCCGGCACCCTCGCCGGCATCCTCGCCCGC 3950
AIIITGGSGTLAGILAR
CACCTGAACCACCCCCACACCTACCTCCTCTCCCGCACCCCACCCCCCGA 4000
HLNHPHTYLLSRTPPPD
CGCCACCCCCGGCACCCACCTCCCCTGCGACGTCGGCGACCCCCACCAAC 4050
ATPGTHLPCDVGDPHQ
TCGCCACCACCCTCACCCACATCCCCCAACCCCTCACCGCCATCTTCCAC 4100
LATTLTHIPQPLTAIFH
ACCGCCGCCACCCTCGACGACGGCATCCTCCACGCCCTCACCCCCGACCG 4150
TAATLDDGILHALTPDR
CCTCACCACCGTCCTCCACCCCAAAGCCAACGCCGCCTGGCACCTGCACC 4200
LTTVLHPKANAAWHLH

ACCTCACCCAAAACCAACCCCTCACCCACTTCGTCCTCTACTCCAGCGCC 4 250 
HLTQNQPLTHFVLYSSA 
GCCGCCGTCCTCGGCAGCCCCGGACAAGGAAACTACGCCGCCGCCAACGC 4 300 
AAVLGS PGQGNYAAA.NA 
40 CTTCCTCGACGCCCTCGCCACCCACCGCCACACCCTCGGCCAACCCGCCA 4 350 
FLD.ALATHRHTLG QPA 
CCTCCATCGCCTGGGGCATGTGGCACACCACCAGCACCCTCACCGGACAA 4 400 
TSIAWGMWHTTSTLTGQ 
CTCGACGACGCCGACCGGGACCGCATCCGCCGCGGCGGTTTCCTCCCGAT 44 50 
45 LDDADR DRIRRGGFLPI 
CACGGACGACGAGGGCATGGGGATGCAT
TDDEG
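
By way of illustration only, the reading frame of the listing above can be checked by translating the DNA with the standard genetic code. The following Python sketch builds the codon table compactly and translates the first 50-nucleotide line of the listing starting at the third nucleotide; the function name translate and the offset parameter are hypothetical, and the choice of frame is inferred from the amino acid line printed beneath the DNA.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna, offset=0):
    """Translate complete codons of dna, starting at 0-based position offset."""
    dna = dna.upper().replace(" ", "")
    codons = (dna[i:i + 3] for i in range(offset, len(dna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


first_line = "AGATCTGGCAGCTCGCCGAAGCGCTGCTGACGCTCGTCCGGGAGAGCACC"

# The BglII site AGATCT opens the cassette; the coding frame begins at ATC.
peptide = translate(first_line, offset=2)
print(peptide)
```

Translating in this frame reproduces the amino acid line printed under the first row of the listing, which indicates that the cassette is read two nucleotides into the BglII site.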



The AvrII-XhoI restriction fragment that encodes module 8 of the FK-520 PKS
with the endogenous AT domain replaced by the AT domain of module 13 (specific for
methylmalonyl CoA) of the rapamycin PKS has the DNA sequence and encodes the
amino acid sequence shown below. 

AGATCTGGCAGCTCGCCGAAGCGCTGCTGACGCTCGTCCGGGAGAGCACC 5 0 
QLAEALLTLVREST 
5 GCCGCCGTGCTCGGCCACGTGGGTGGCGAGGACATCCCCGCGACGGCGGC 100 
AAV L G H V G G E D I PAT AA 
GTTCAAGGACCTCGGCATCGACTCGCTCACCGCGGTCCAGCTGCGCAACG 150 

F K. D L G I DSLTAVQLRN 
CCCTCACCGAGGCGACCGGTGTGCGGCTGAACGCCACGGCGGTCTTCGAC 200 
10 ALTEATGVRLNATAVFD 

TTCCCGACCCCGCACGTGCTCGCCGGGAAGCTCGGCGACGAACTGACCGG 250 

FPTPHVLAGKLGDELTG 
CACCCGCGCGCCCGTCGTGCCCCGGACCGCGGCCACGGCCGGTGCGCACG 300 
TRAPVVPRTAATAGAH 
ACGAGCCGCTGGCGATCGTGGGAATGGCCTGCCGGCTGCCCGGCGGGGTC 350
DEPLAIVGMACRLPGGV
GCGTCACCCGAGGAGCTGTGGCACCTCGTGGCATCCGGCACCGACGCCAT 400
ASPEELWHLVASGTDAI
CACGGAGTTCCCGACGGACCGCGGCTGGGACGTCGACGCGATCTACGACC 450
TEFPTDRGWDVDAIYD
CGGACCCCGACGCGATCGGCAAGACCTTCGTCCGGCACGGTGGCTTCCTC 500
PDPDAIGKTFVRHGGFL
ACCGGCGCGACAGGCTTCGACGCGGCGTTCTTCGGCATCAGCCCGCGCGA 550
TGATGFDAAFFGISPRE
GGCCCTCGCGATGGACCCGCAGCAGCGGGTGCTCCTGGAGACGTCGTGGG 600
ALAMDPQQRVLLETSW
AGGCGTTCGAAAGCGCCGGCATCACCCCGGACTCGACCCGCGGCAGCGAC 650
EAFESAGITPDSTRGSD
ACCGGCGTGTTCGTCGGCGCCTTCTCCTACGGTTACGGCACCGGTGCGGA 700
TGVFVGAFSYGYGTGAD
CACCGACGGCTTCGGCGCGACCGGCTCGCAGACCAGTGTGCTCTCCGGCC 750
TDGFGATGSQTSVLSG

GGCTGTCGTACTTCTACGGTCTGGAGGGTCCGGCGGTCACGGTCGACACG 800 
RLSYFYGLEGPAVT VDT 
35 GCGTGTTCGTCGTCGCTGGTGGCGCTGCACCAGGCCGGGCAGTCGCTGCG 850 
ACSSSLVALHQAGQSLR 
CTCCGGCGAATGCTCGCTCGCCCTGGTCGGCGGCGTCACGGTGATGGCGT 900 

SGECSLALVGGVTVMA 
CTCCCGGCGGCTTCGTGGAGTTCTCCCGGCAGCGCGGCCTCGCGCCGGAC 950 
40 S PG G FVE F S RQRGLAP D 

GGCCGGGCGAAGGCGTTCGGCGCGGGTGCGGACGGCACGAGCTTCGCCGA 1000 

GRAKAFGAGADGTS FAE 
GGGTGCCGGTGTGCTGATCGTCGAGAGGCTCTCCGACGCCGAACGC AACG 1050 
GAGVLIVERLSDAERN 
45 GTCACACCGTCCTGGCGGTCGTCCGTGGTTCGGCGGTCAACCAGGATGGT 1100 
GHTVLAVVRGSAVNQDG 
GCCTCCAACGGGCTGTCGGCGCCGAACGGGCCGTCGCAGGAGCGGGTGAT 1150 

ASNGLSAPNGPSQERVI 
CCGGCAGGCCCTGGCCAACGCCGGGCTCACCCCGGCGGACGTGGACGCCG 1200 
50 RQALANAGLTPADVDA 

TCGAGGCCCACGGCACCGGCACCAGGCTGGGCGACCCCATCGAGGCACAG 1250
VEAHGTGTRLGDPIEAQ
GCGGTACTGGCCACCTACGGACAGGAGCGCGCCACCCCCCTGCTGCTGGG 1300

AVLATYGQERAT PLLLG 
CTCGCTGAAGTCCAACATCGGCCACGCCCAGGCCGCGTCCGGCGTCGCCG 1350 
S LKSNI GHAQAAS.GVA 
5 GCATCATCAAGATGGTGCAGGCCCTCCGGCACGGGGAGCTGCCGCCGACG 14 00 
GI I KMVQALRHGELPPT 
CTGCACGCCGACGAGCCGTCGCCGCACGTCGACTGGACGGCCGGCGCCGT 1 4 50 

LHADEPS PHVDWTAGAV 
CGAACTGCTGACGTCGGCCCGGCCGTGGCCCGAGACCGACCGGCCTAGGC 1500 
10 ELLTSARPWPETDRPR 

GGGCGGGCGTGTCGTCCTTCGGAGTCAGCGGCACCAACGCCCACGTCATC 1550 
RAG VSSFGVSGTNAHVI 
CTGGAGAGCGCACCCCCCGCTCAGCCCGCGGAGGAGGCGCAGCCTGTTGA .1600 
LESAPPAQPAEEAQPVE 
1 5 GACGCCGGTGGTGGCCTCGGATGTGCTGCCGCTGGTGAT ATCGGCCAAGA 1650 
T PVVA S DVL PLVI SAK 
CCCAGCCCGCCCTGACCGAACACGAAGACCGGCTGCGCGCCTACCTGGCG 1700 
TQPALTEHEDRLRAYLA
GCGTCGCCCGGGGCGGATATACGGGCTGTGGCATCGACGCTGGCGGTGAC 1750
ASPGADIRAVASTLAVT
ACGGTCGGTGTTCGAGCACCGCGCCGTACTCCTTGGAGATGACACCGTCA 1800
RSVFEHRAVLLGDDTV
CCGGCACCGCGGTGACCGACCCCAGGATCGTGTTTGTCTTTCCCGGGCAG 1850
TGTAVTDPRIVFVFPGQ
25 GGGTGGCAGTGGCTGGGGATGGGCAGTGCACTGCGCGATTCGTCGGTGGT 1900 
GWQWLGMGSALRDSSVV 
GTTCGCCGAGCGGATGGCCGAGTGTGCGGCGGCGTTGCGCGAGTTCGTGG 1950 

FAE RMAECAAALREFV 
ACTGGGATCTGTTCACGGTTCTGGATGATCCGGCGGTGGTGGACCGGGTT 2000 
30 DW D L FTV L D D PAVVDRV 

GATGTGGTCCAGCCCGCTTCCTGGGCGATGATGGTTTCCCTGGCCGCGGT 2050 

DVVQ P A S WA MMVS L A A V 
GTGGCAGGCGGCCGGTGTGCGGCCGGATGCGGTGATCGGCCATTCGCAGG 2100 
WQAAGVRPDAVIGHSQ 
35 GTGAGATCGCCGCAGCTTGTGTGGCGGGTGCGGTGTCACTAGGCGATGCC 2150 
GEIAAA CVAGAVSLRDA 
GCCCGGATCGTGACCTTGCGCAGCCAGGCGATCGCCCGGGGCCTGGCGGG 2200 

ARIVTLRSQAIARGLAG 
CCGGGGCGCGATGGCATCCGTCGCCCTGCCCGCGCAGGATGTCGAGCTGG 2250 
40 RGAMASVALPAQDVEL 

TCGACGGGGCCTGGATCGCCGCCCACAACGGGCCCGCCTCCACCGTGATC 2300 
VDGAWIAAHNGPASTVI 
GCGGGCACCCCGGAAGCGGTCGACCATGTCCTCACCGCTCATGAGGCACA 2350 
AGT PEAVDHVLTAHEAQ 
45 AGGGGTGCGGGTGCGGCGGATCACCGTCGACTATGCCTCGCACACCCCGC 2400 
GVRVRRI TVDYASHTP 
ACGTCGAGCTGATCCGCGACGAACTACTCGACATCACTAGCGACAGCAGC 2450 
HVELIRDELLDITSDSS 
TCGCAGACCCCGCTCGTGCCGTGGCTGTCGACCGTGGACGGCACCTGGGT 2500 
50 SQTPLV PWLSTVDG TWV 

CGACAGCCCGCTGGACGGGGAGTACTGGTACCGGAACCTGCGTGAACCGG 2550 

DSPLDGEYWYRNLREP 
TCGGTTTCCACCCCGCCGTCAGCCAGTTGCAGGCCCAGGGCGACACCGTG 2600 
VGFHPAVSQLQAQGDTV
TTCGTCGAGGTCAGCGCCAGCCCGGTGTTGTTGCAGGCGATGGACGACGA 2650

FVEVSAS PVLLQAMDDD 
TGTCGTCACGGTTGCCACGCTGCGTCGTGACGACGGCGACGCCACCCGGA 2700 
V V T VAT L RR D DG DAT R 
5 TGCTCACCGCCCTGGCACAGGCCTATGTCCACGGCGTCACCGTCGACTGG 2750 
MLTALAQAYVHGVTVDW 
CCCGCCATCCTCGGCACCACCACAACCCGGGTACTGGACCTTCCGACCTA 2800 

PA I LGTT TTRVLD LPTY 
CGCCTTCCAACACCAGCGGTACTGGCTCGAGTCGGCACGCCCGGCCGCAT 2850 
10 AFQHQRYWLESAR. PAA 

CCGACGCGGGCCACCCCGTGCTGGGCTCCGGTATCGCCCTCGCCGGGTCG 2 900 
SDAGHPVLGSGIALAGS 
CCGGGCCGGGTGTTCACGGGTTCCGTGCCGACCGGTGCGGACCGCGCGGT 2950 
PGRVFTG SVPTGADRAV 
15 GTTCGTCGCCGAGCTGGCGCTGGCCGCCGCGGACGCGGTCGACTGCGCCA 3000 
FVAELALAAADAVDCA 
CGGTCGAGCGGCTCGACATCGCCTCCGTGCCCGGCCGGCCGGGCCATGGC 3050 
TVERLDIASVPGRPGHG 
CGGACGACCGTACAGACCTGGGTCGACGAGCCGGCGGACGACGGCCGGCG 3100 
RTTVQTWVDEPADDGRR
CCGGTTCACCGTGCACACCCGCACCGGCGACGCCCCGTGGACGCTGCACG 3150
RFTVHTRTGDAPWTLH
CCGAGGGGGTGCTGCGCCCCCATGGCACGGCCCTGCCCGATGCGGCCGAC 3200
AEGVLRPHGTALPDAAD
GCCGAGTGGCCCCCACCGGGCGCGGTGCCCGCGGACGGGCTGGCGGGTGT 3250
AEWPPPGAVPADGLPGV
GTGGCGCCGGGGGGACCAGGTCTTCGCCGAGGCCGAGGTGGACGGACCGG 3300
WRRGDQVFAEAEVDGP
ACGGTTTCGTGGTGCACCCCGACCTGCTCGACGCGGTCTTCTCCGCGGTC 3350
DGFVVHPDLLDAVFSAV
GGCGACGGAAGCCGCCAGCCGGCCGGATGGCGCGACCTGACGGTGCACGC 3400
GDGSRQPAGWRDLTVHA
GTCGGACGCCACCGTACTGCGCGCCTGCCTCACCCGGGGCACCGACGGAG 3450
SDATVLRACLTRRTDG
CCATGGGATTCGCCGCCTTCGACGGCGCCGGCCTGCCGGTACTCACCGCG 3500
AMGFAAFDGAGLPVLTA 
GAGGCGGTGACGCTGCGGGAGGTGGCGTCACCGTCCGGCTCCGAGGAGTC 3550 

EAVTLREVASPSGSEES 
GGACGGCCTGCACGGGTTGGAGTGGCTCGCGGTCGCCGAGGCGGTCTACG 3600 
40 DGLHRLEWLAVAEAVY 

ACGGTGACCTGCCCGAGGGACATGTCCTGATCACCGCCGCCCACCCCGAC 3650 
DGDLPEGHVLITAAHPD 
GACCCCGAGGACATACCCACCCGCGCCCACACCCGCGCCACCCGCGTCCT 3700 
DPEDIPTRAHTRATRVL
GACCGCCCTGCAACACCACCTCACCACCACCGACCACACCCTCATCGTCC 3750
TALQHHLTTTDHTLIV
ACACCACCACCGACCCCGCCGGCGCCACCGTCACCGGCCTCACCCGCACC 3800
HTTTDPAGATVTGLTRT
GCCCAGAACGAACACCCCCACCGCATCCGCCTCATCGAAACCGACCACCC 3850
AQNEHPHRIRLIETDHP

CCACACCCCCCTCCCCCTGGCCCAACTCGCCACCCTCGACCACCCCCACC 3900 

HTPLPLAQLATLDHP H 
TCCGCCTCACCCACCACACCCTCCACCACCCCCACCTCACCCCCCTCCAC 3950 
LRLTHHTLHHPHLTP LH 
ACCACCACCCCACCCACCACCACCCCCCTCAACCCCGAACACGCCATCAT 4000

TTTPPTTTPLNPEHAI I 
CATCACCGGCGGCTCCGGCACCCTCGCCGGCATCCTCGCCCGCCACCTGA 4 050 
ITGGSGTLAGI LARHL 
ACCACCCCCACACCTACCTCCTCTCCCGCACCCCACCCCCCGACGCCACC 4100
NHPHTYLLSRTPPPDAT 
CCCGGCACCCACCTCCCCTGCGACGTCGGCGACCCCCACCAACTCGCCAC 4150 

PGTHLPCDVGDPHQLAT
CACCCTCACCCACATCCCCCAACCCCTCACCGCCATCTTCCACACCGCCG 4200
TLTHIPQPLTAIFHTA

CCACCCTCGACGACGGCATCCTCCACGCCCTCACCCCCGACCGCCTCACC 4 250 
ATLD DGILHALTPDRLT 
ACCGTCCTCCACCCCAAAGCCAACGCCGCCTGGCACCTGCACCACCTCAC 4 300 
TVLHPKANAAWHLHHLT 
CCAAAACCAACCCCTCACCCACTTCGTCCTCTACTCCAGCGCCGCCGCCG 4350
QNQPLTHFVLYSSAAA 
TCCTCGGCAGCCCCGGACAAGGAAACTACGCCGCCGCCAACGCCTTCCTC 4 400 
VLGSPGQGNYAAANAFL 
GACGCCCTCGCCACCCACCGCCACACCCTCGGCCAACCCGCCACCTCCAT 4 450 
DALATHRHTLGQPATSI
CGCCTGGGGCATGTGGCACACCACCAGCACCCTCACCGGACAACTCGACG 4500

AWGMWHTTSTLTGQLD
ACGCCGACCGGGACCGCATCCGCCGCGGCGGTTTCCTCCCGATCACGGAC 4550

DADRDRIRRGGFLPITD

GACGAGGGCATGGGGATGCAT
DEG






The NheI-XhoI restriction fragment that encodes module 8 of the FK-520 PKS
with the endogenous AT domain replaced by the AT domain of module 12 (specific for
malonyl CoA) of the rapamycin PKS has the DNA sequence and encodes the amino acid
sequence shown below.



AGATCTGGCAGCTCGCCGAAGCGCTGCTGACGCTCGTCCGGGAGAGCACC 5 0 

QLAEALLTLVREST 
GCCGCCGTGCTCGGCCACGTGGGTGGCGAGGACATCCCCGCGACGGCGGC 100 
AAVLGHVGGEDIPATAA

GTTCAAGGACCTCGGCATCGACTCGCTCACCGCGGTCCAGCTGCGCAACG 150 

FKDLGIDS LTAVQLRN 
CCCTCACCGAGGCGACCGGTGTGCGGCTGAACGCCACGGCGGTCTTCGAC 200 
ALTEATGVRLNATAVFD 
TTCCCGACCCCGCACGTGCTCGCCGGGAAGCTCGGCGACGAACTGACCGG 250
FPTPHVLAGKLGDELTG 
CACCCGCGCGCCCGTCGTGCCCCGGACCGCGGCCACGGCCGGTGCGCACG 300 

TRAPVVPRTAATAGA H 
ACGAGCCGCTGGCGATCGTGGGAATGGCCTGCCGGCTGCCCGGCGGGGTC 350 
DEPLAIVGMACRLPGGV

GCGTCACCCGAGGAGCTGTGGCACCTCGTGGCATCCGGCACCGACGCCAT 4 00 

ASPEELWHLVASGTDAI 
CACGGAGTTCCCGACGGACCGCGGCTGGGACGTCGACGCGATCTACGACC 4 50 
TEFPTDRGWDVDAIYD 
CGGACCCCGACGCGATCGGCAAGACCTTCGTCCGGCACGGTGGCTTCCTC 500






PDPDAIGKTFVRHGGFL 
ACCGGCGCGACAGGCTTCGACGCGGCGTTCTTCGGCATCAGCCCGCGCGA 550 

TGATGFDAAFFGI S PRE 
GGCCCTCGCGATGGACCCGCAGCAGCGGGTGCTCCTGGAGACGTCGTGGG 600 
ALAMDPQQRVLLETSW

AGGCGTTCGAAAGCGCCGGCATCACCCCGGACTCGACCCGCGGCAGCGAC 650 
EAFESAGITPDSTRGSD 
ACCGGCGTGTTCGTCGGCGCCTTCTCCTACGGTTACGGCACCGGTGCGGA 700 
TGVFVGAFS YGYGTGAD 
CACCGACGGCTTCGGCGCGACCGGCTCGCAGACCAGTGTGCTCTCCGGCC 750
TDGFGATGSQTSVLSG 
GGCTGTCGTACTTCTACGGTCTGGAGGGTCCGGCGGTCACGGTCGACACG 800 
RLSYFYGLEGPAVTVDT 
GCGTGTTCGTCGTCGCTGGTGGCGCTGCACCAGGCCGGGCAGTCGCTGCG 850 
ACSSSLVALHQAGQSLR

CTCCGGCGAATGCTCGCTCGCCCTGGTCGGCGGCGTCACGGTGATGGCGT 900 
SGECS LAL VGGVTVMA 
CTCCCGGCGGCTTCGTGGAGTTCTCCCGGCAGCGCGGCCTCGCGCCGGAC 950

SPGGFVEFSRQRGLAPD
GGCCGGGCGAAGGCGTTCGGCGCGGGTGCGGACGGCACGAGCTTCGCCGA 1000
GRAKAFGAGADGTSFAE

GGGTGCCGGTGTGCTGATCGTCGAGAGGCTCTCCGACGCCGAACGCAACG 1050

GAGVLIVERLSDAERN

GTCACACCGTCCTGGCGGTCGTCCGTGGTTCGGCGGTCAACCAGGATGGT 1100

GHTVLAVVRGSAVNQDG
GCCTCCAACGGGCTGTCGGCGCCGAACGGGCCGTCGCAGGAGCGGGTGAT 1150

ASNGLSAPNGPSQERVI

CCGGCAGGCCCTGGCCAACGCCGGGCTCACCCCGGCGGACGTGGACGCCG 1200 
RQALANAGLTPADVDA 
TCGAGGCCCACGGCACCGGCACCAGGCTGGGCGACCCCATCGAGGCACAG 1250
VEAHGTGTRLGDPI EAQ 
GCGGTACTGGCCACCTACGGACAGGAGCGCGCCACCCCCCTGCTGCTGGG 1300 
AVLATYGQERATPLLLG

CTCGCTGAAGTCCAACATCGGCCACGCCCAGGCCGCGTCCGGCGTCGCCG 1350 
SLKSNIGHAQAASGVA

GCATCATCAAGATGGTGCAGGCCCTCCGGCACGGGGAGCTGCCGCCGACG 1400 
GIIKMVQALRHGELPPT
CTGCACGCCGACGAGCCGTCGCCGCACGTCGACTGGACGGCCGGCGCCGT 14 50 
LHADEPS PHVDWTAG AV 
CGAACTGCTGACGTCGGCCCGGCCGTGGCCCGAGACCGACCGGCCACGGC 1500
ELLTSARPWPETD RPR 
GTGCCGCCGTCTCCTCGTTCGGGGTGAGCGGCACCAACGCCCACGTCATC 1550 
RAAVSSFGVSGTNAHVI
CTGGAGGCCGGACCGGTAACGGAGACGCCCGCGGCATCGCCTTCCGGTGA 1600
LEAGPVTETPAASPSGD

CCTTCCCCTGCTGGTGTCGGCACGCTCACCGGAAGCGCTCGACGAGCAGA 1650 

LPLLVSARS PEALDEQ 
TCCGCCGACTGCGCGCCTACCTGGACACCACCCCGGACGTCGACCGGGTG 1700 
IRRLRAYLDTTPDVDRV 
GCCGTGGCACAGACGCTGGCCCGGCGCACACACTTCGCCCACCGCGCCGT 1750
AVAQT LARRT H FAH RAV 
GCTGCTCGGTGACACCGTCATCACCACACCCCCCGCGGACCGGCCCGACG 1800 

LLGDTVITTPPADRPD 
AACTCGTCTTCGTCTACTCCGGCCAGGGCACCCAGCATCCCGCGATGGGC 1850 
ELVFVYSGQGTQHPAMG
GAGCAGCTAGCCGCCGCGTTCCCCGTCTTCGCGCGGATCCATCAGCAGGT 1900 

EQLAAAFPVFARIHQQV 
GTGGGACCTGCTCGATGTGCCCGATCTGGAGGTGAACGAGACCGGTTACG 1950 
WDLLDVPDLEVNETGY

CCCAGCCGGCCCTGTTCGCAATGCAGGTGGCTCTGTTCGGGCTGCTGGAA 2000 
AQPALFAMQVALFGLLE 
TCGTGGGGTGTACGACCGGACGCGGTGATCGGCCATTCGGTGGGTGAGCT 2050 
SWGVRPDAVIGHSVGEL
TGCGGCTGCGTATGTGTCCGGGGTGTGGTCGTTGGAGGATGCCTGCACTT 2100
AA AYVSGVWSLEDACT 
TGGTGTCGGCGCGGGCTCGTCTGATGCAGGCTCTGCCCGCGGGTGGGGTG 2150 
L V S A R A R L MQAL P AG G V 
ATGGTCGCTGTCCCGGTCTCGGAGGATGAGGCCCGGGCCGTGCTGGGTGA 2200 
MVAVPVSEDEARAVLGE

GGGTGTGGAGATCGCCGCGGTCAACGGCCCGTCGTCGGTGGTTCTCTCCG 2250 

GVEIAAVNGPSSVVLS 
GTGATGAGGCCGCCGTGCTGCAGGCCGCGGAGGGGCTGGGGAAGTGGACG 2300 
GDEAAVLQAAEGLGKWT

CGGCTGGCGACCAGCCACGCGTTCCATTCCGCCCGTATGGAACCCATGCT 2350
RLATSHAFHSARMEPML

GGAGGAGTTCCGGGCGGTCGCCGAAGGCCTGACCTACCGGACGCCGCAGG 2400

EEFRAVAEGLTYRTPQ
TCTCCATGGCCGTTGGTGATCAGGTGAGCACCGCTGAGTACTGGGTGCGG 2450

VSMAVGDQVTTAEYWVR

CAGGTCCGGGACACGGTCCGGTTCGGCGAGCAGGTGGCCTCGTACGAGGA 2500
QVRDTVRFGEQVASYED
CGCCGTGTTCGTCGAGCTGGGTGCCGACCGGTCACTGGCCCGCCTGGTCG 2550

AVFVELGADRSLARLV
ACGGTGTCGCGATGCTGCACGGCGACCACGAAATCCAGGCCGCGATCGGC 2600
DGVAMLHGDHEIQAAIG
GCCCTGGCCCACCTGTATGTCAACGGCGTCACGGTCGACTGGCCCGCGCT 2650

ALAHLYVNGVTVDWPAL
CCTGGGCGATGCTCCGGCAACACGGGTGCTGGACCTTCCGACATACGCCT 2700

LGDAPATRVLDLPTYA

TCCAGCACCAGCGCTACTGGCTCGAGTCGGCACGCCCGGCCGCATCCGAC 2750 
FQHQRYWLESARPAAS D 
GCGGGCCACCCCGTGCTGGGCTCCGGTATCGCCCTCGCCGGGTCGCCGGG 2800 
AGH PVLGSGIALAGSPG 
CCGGGTGTTCACGGGTTCCGTGCCGACCGGTGCGGACCGCGCGGTGTTCG 2850
RVFTGSVPTGADRAVF 
TCGCCGAGCTGGCGCTGGCCGCCGCGGACGCGGTCGACTGCGCCACGGTC 2 900 
V A E LA LAAADAVDCAT V 
GAGCGGCTCGACATCGCCTCCGTGCCCGGCCGGCCGGGCCATGGCCGGAC 2 950 
ERLDIASVPGRPGHGRT

GACCGTACAGACCTGGGTCGACGAGCCGGCGGACGACGGCCGGCGCCGGT 3000 

TVQTWVDEP ADDGRRR 
TCACCGTGCACACCCGCACCGGCGACGCCCCGTGGACGCTGCACGCCGAG 3050 
FTVHTRTGDAPWTLHAE 
GGGGTGCTGCGCCCCCATGGCACGGCCCTGCCCGATGCGGCCGACGCCGA 3100
GVL RP HGTALPDAADAE 
GTGGCCCCCACCGGGCGCGGTGCCCGCGGACGGGCTGCCGGGTGTGTGGC 3150 

WPPPGAVPADGLPGVW 
GCCGGGGGGACCAGGTCTTCGCCGAGGCCGAGGTGGACGGACCGGACGGT 3200 
RRGDQVFAEAEVDGPDG
TTCGTGGTGCACCCCGACCTGCTCGACGCGGTCTTCTCCGCGGTCGGCGA 3250 

FVVH P DLLDAVFSAVG D 
CGGAAGCCGCCAGCCGGCCGGATGGCGCGACCTGACGGTGCACGCGTCGG 3300 
GSRQPAGWRDLTVHAS

ACGCCACCGTACTGCGCGCCTGCCTCACCCGGCGCACCGACGGAGCCATG 3350 
DATVLRACLTRRTDGAM 
GGATTCGCCGCCTTCGACGGCGCCGGCCTGCCGGTACTCACCGCGGAGGC 34 00 
GFAAFDGAGLPVLTAEA 
GGTGACGCTGCGGGAGGTGGCGTCACCGTCCGGCTCCGAGGAGTCGGACG 3450
VTLREVASPSGSEESD 
GCCTGCACCGGTTGGAGTGGCTCGCGGTCGCCGAGGCGGTCTACGACGGT 3500 
GLHRLEWLAVAEAVYDG 
GACCTGCCCGAGGGACATGTCCTGATCACCGCCGCCCACCCCGACGACCC 3550 
DLPEGHVLITAAHPDDP

CGAGGACATACCCACCCGCGCCCACACCCGCGCCACCCGCGTCCTGACCG 3600 

EDI PTRAHTRATRVLT 
CCCTGCAACACCACCTCACCACCACCGACCACACCCTCATCGTCCACACC 3650 
ALQHHLTTTDHTLIVHT 
ACCACCGACCCCGCCGGCGCCACCGTCACCGGCCTCACCCGCACCGCCCA 3700
TTDPAGATVTGLTRTAQ
GAACGAACACCCCCACCGCATCCGCCTCATCGAAACCGACCACCCCCACA 3750

NEHPHRIRLIETDHPH
CCCCCCTCCCCCTGGCCCAACTCGCCACCCTCGACCACCCCCACCTCCGC 3800

TPLPLAQLATLDHPHLR
CTCACCCACCACACCCTCCACCACCGCCACCTCACCCCCCTCCACACCAC 3850

LTHHTLHHPHLTPLHTT

CACCCCACCCACCACCACCCCCCTCAACCCCGAACACGCCATCATCATCA 3900

TPPTTTPLNPEHAIII

CCGGCGGCTCCGGCACCCTCGCCGGCATCCTCGCCCGCCACCTGAACCAC 3950
TGGSGTLAGILARHLNH
CCCCACACCTACCTCCTCTCCCGCACCCCACCCCCCGACGCCACCCCCGG 4000

PHTYLLSRTPPPDATPG
CACCCACCTCCCCTGCGACGTCGGCGACCCCCACCAACTCGCCACCACCC 4050

THLPCDVGDPHQLATT

TCACCCACATCCCCCAACCCCTCACCGGCATCTTCCACACCGCCGCCACC 4100 
L T H I PQ PLTAI FHTAA T 
CTCGACGACGGCATCCTCCACGCCCTCACCCCCGACCGCCTCACCACCGT 4150 
LDDGI LHALTPDRLTT V 
CCTCCACCCCAAAGCCAACGCCGCCTGGCACCTGCACCACCTCACCCAAA 4200
LHPKANAAWHLHHLTQ 
ACCAACCCCTCACCCACTTCGTCCTCTACTCCAGCGCCGCCGCCGTCCTC 4250 
N QPLTH FVLYSSAAAVL 
GGCAGCCCCGGACAAGGAAACTACGCCGCCGCCAACGCCTTCCTCGACGC 4300 
GSPGQGNYAAANAFLDA

CCTCGCCACCCACCGCCACACCCTCGGCCAACCCGCCACCTCCATCGCCT 4 350 

LATHRHTLGQPATSIA
GGGGCATGTGGCACACCACCAGCACCCTCACCGGACAACTCGACGACGCC 4400
WGMWHTTSTLTGQLDDA
GACCGGGACCGCATCCGCCGCGGCGGTTTCCTCCCGATCACGGACGACGA 4450
DRDRIRRGGFLPITDDE 
GGGCATGGGGATGCAT 
G 
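
The reading frame of a listing such as the one above can be checked by translating the printed DNA and comparing the result with the printed amino acid line. The following is a minimal sketch, assuming the standard genetic code; the helper name `translate` and the frame offset of 8 nucleotides are illustrative choices inferred from the first printed line, not part of the original disclosure.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str, offset: int = 0) -> str:
    """Translate dna starting at offset; whitespace and a trailing partial codon are ignored."""
    seq = "".join(dna.split()).upper()[offset:]
    usable = len(seq) - len(seq) % 3
    # Unknown codons (for example, OCR damage) are reported as "X".
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3))


# First 50 nucleotides of the cassette listing; the printed protein begins at Q.
first_line = "AGATCTGGCAGCTCGCCGAAGCGCTGCTGACGCTCGTCCGGGAGAGCACC"
print(translate(first_line, offset=8))
```

Running the same check line by line is one way to spot transcription errors in either the nucleotide or the amino acid rows.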
The NheI-XhoI restriction fragment that encodes module 8 of the FK-520 PKS
with the endogenous AT domain replaced by the AT domain of module 13 (specific for
methylmalonyl CoA) of the rapamycin PKS has the DNA sequence and encodes the
amino acid sequence shown below.

AGATCTGGCAGCTCGCCGAAGCGCTGCTGACGCTCGTCCGGGAGAGCACC 5 0 

QLAEALLTLVREST 
GCCGCCGTGCTCGGCCACGTGGGTGGCGAGGACATCCCCGCGACGGCGGC 100 

AAVLGHVGGEDI PATAA 
GTTCAAGGACCTCGGCATCGACTCGCTCACCGCGGTCCAGCTGCGCAACG 150 

FKDLGIDSLTAVQLRN 
CCCTCACCGAGGCGACCGGTGTGCGGCTGAACGCCACGGCGGTCTTCGAC 200 
ALTEATGVRLNATAVFD 
TTCCCGACCCCGCACGTGCTCGCCGGGAAGCTCGGCGACGAACTGACCGG 250 

FPT PHVLAGKLGDELTG 
CACCCGCGCGCCCGTCGTGCCCCGGACCGCGGCCACGGCCGGTGCGCACG 300 

T RA PVV P RT AATAGAH 
ACGAGCCGCTGGCGATCGTGGGAATGGCCTGCCGGCTGCCCGGCGGGGTC 350 
DEPLAIVGMACRLPGGV 
GCGTCACCCGAGGAGCTGTGGCACCTCGTGGCATCCGGCACCGACGCCAT 4 00 

AS PEELWHLVASGTDAI 
CACGGAGTTCCCGACGGACCGCGGCTGGGACGTCGACGCGATCTACGACC 4 50 

TEFPTDRGWDVDAIYD
CGGACCCCGACGCGATCGGCAAGACCTTCGTCCGGCACGGTGGCTTCCTC 500 
PDPDAIGKTFVRHGGFL 
ACCGGCGCGACAGGCTTCGACGCGGCGTTCTTCGGCATCAGCCCGCGCGA 550 

TGATGFDAAFFGISPRE 
GGCCCTCGCGATGGACCCGCAGCAGCGGGTGCTCCTGGAGACGTCGTGGG 600 

ALAMDPQQRVLLETSW 
AGGCGTTCGAAAGCGCCGGCATCACCCCGGACTCGACCCGCGGCAGCGAC 650 
EAFESAGITPDSTRGSD 
ACCGGCGTGTTCGTCGGCGCCTTCTCCTACGGTTACGGCACCGGTGCGGA 700 

TGVFVGAFS YGYGTGAD 
CACCGACGGCTTCGGCGCGACCGGCTCGCAGACCAGTGTGCTCTCCGGCC 750 

TDGFGATGSQTSVLSG 
GGCTGTCGTACTTCTACGGTCTGGAGGGTCCGGCGGTCACGGTCGACACG 800 
RLSYFYGLEGPAVTVDT 
GCGTGTTCGTCGTCGCTGGTGGCGCTGCACCAGGCCGGGCAGTCGCTGCG 850 

ACS SSLVALHQAGQSLR 
CTCCGGCGAATGCTCGCTCGCCCTGGTCGGCGGCGTCACGGTGATGGCGT 900 

SGEC S LALVGGVTVMA 
CTCCCGGCGGCTTCGTGGAGTTCTCCCGGCAGCGCGGCCTCGCGCCGGAC 950 
SPGGFVEFSRQRGLAPD 
GGCCGGGCGAAGGCGTTCGGCGCGGGTGCGGACGGCACGAGCTTCGCCGA 1000 

GRAKAFGAGADGTSFAE 
GGGTGCCGGTGTGCTGATCGTCGAGAGGCTCTCCGACGCCGAACGCAACG 1050 

GAGVLIVERLSDAERN 
GTCACACCGTCCTGGCGGTCGTCCGTGGTTCGGCGGTCAACCAGGATGGT 1100 
GHTVLAVVRGSAVNQDG 
GCCTCCAACGGGCTGTCGGCGCCGAACGGGCCGTCGCAGGAGCGGGTGAT 1150 
ASNGLSAPNGPSQERVI 
CCGGCAGGCCCTGGCCAACGCCGGGCTCACCCCGGCGGACGTGGACGCCG 1200

RQALAN AGLT PADVDA 
TCGAGGCCCACGGCACCGGCACCAGGCTGGGCGACCCCATCGAGGCACAG 1250 
VEAHGTGTRLGDPIEAQ 
GCGGTACTGGCCACCTACGGACAGGAGCGCGCCACCCCCCTGCTGCTGGG 1300
AVLATYGQERAT PLLLG 
CTCGCTGAAGTCCAACATCGGCCACGCCCAGGCCGCGTCCGGCGTCGCCG 1350 

SLKSNI GHAQAASGVA 
GCATCATCAAGATGGTGCAGGCCCTCCGGCACGGGGAGCTGCCGCCGACG 14 00 
GIIKMVQALRHGELPPT

CTGCACGCCGACGAGCCGTCGCCGCACGTCGACTGGACGGCCGGCGCCGT 14 50 

LHADE PS PH VDWTAGAV 
CGAACTGCTGACGTCGGCCCGGCCGTGGCCCGAGACCGACCGGCCACGGC 1500 
ELLTSARPWPETDRPR 
GTGCCGCCGTCTCCTCGTTCGGGGTGAGGGGCACCAACGCCCACGTCATC 1550
RAAVS SFGVSGTNAHVI 
CTGGAGGCCGGACCGGTAACGGAGACGCCCGCGGCATCGCCTTCCGGTGA 1600 

LEAGPVTET PAASPSGD 
CCTTCCCCTGCTGGTGTCGGCACGCTCACCGGAAGCGCTCGACGAGCAGA 1650 
LPLLVSARSPEALDEQ

TCCGCCGACTGCGCGGCTACCTGGACACCACCCCGGACGTCGACCGGGTG 1700 
I RRLRAYLDTT PDVDRV 
GCCGTGGCACAGACGCTGGCCCGGCGCACACACTTCGCCCACCGCGCCGT 17 50 

AVAQTLARRTHFAHRAV
GCTGCTCGGTGACACCGTCATCACCACACCCCCCGCGGACCGGCCCGACG 1800
LLGDTVITTPPADRPD
AACTCGTCTTCGTCTACTCCGGCCAGGGCACCCAGCATCCCGCGATGGGC 1850

ELVFVYSGQGTQHPAMG
GAGCAGCTAGCCGATTCGTCGGTGGTGTTCGCCGAGCGGATGGCCGAGTG 1900

EQLADSSVVFAERMAEC
TGCGGCGGCGTTGCGCGAGTTCGTGGACTGGGATCTGTTCACGGTTCTGG 1950

AAALREFVDWDLFTVL
ATGATCCGGCGGTGGTGGACCGGGTTGATGTGGTCCAGCCCGCTTCCTGG 2000

DDPAVVDRVDVVQPASW

GCGATGATGGTTTCCCTGGCCGCGGTGTGGCAGGCGGCCGGTGTGCGGCC 2050
AMMVS LAAVWQAAGVRP 
GGATGCGGTGATCGGCCATTCGCAGGGTGAGATCGCCGCAGCTTGTGTGG 2100 

DAVIGHSQGEIAAACV 
CGGGTGCGGTGTCACTACGCGATGCCGCCCGGATCGTGACCTTGCGCAGC 2150 
AGAVSLRDAARIVTLRS

CAGGCGATCGCCCGGGGCCTGGCGGGCCGGGGCGCGATGGCATCCGTCGC 2200 

QAIARGLAGRGAMASVA 
CCTGCCCGCGCAGGATGTCGAGCTGGTCGACGGGGCCTGGATCGCCGCCC 2250 
LPAQDVELVDGAWIAA 
ACAACGGGCCCGCCTCCACCGTGATCGCGGGCACCCCGGAAGCGGTCGAC 2300
HNGPASTVIAGTPEAVD
CATGTCCTCACCGCTCATGAGGCACAAGGGGTGCGGGTGCGGCGGATCAC 2350
HVLTAHEAQGVRVRRIT
CGTCGACTATGCCTCGCACACCCCGCACGTCGAGCTGATCCGCGACGAAC 2400
VDYASHTPHVELIRDE

TACTCGACATCACTAGCGACAGCAGCTCGCAGACCCCGCTCGTGCCGTGG 2450 
LLDITSDSSSQTPLVPW 
CTGTCGACCGTGGACGGCACCTGGGTCGACAGCCCGCTGGACGGGGAGTA 2500 
LSTVDGTWVDS PLDGEY 
CTGGTACCGGAACCTGCGTGAACCGGTCGGTTTCCACCCCGCCGTCAGCC 2550

WYRNLREPVGFH PAVS 
AGTTGCAGGCCCAGGGCGACACCGTGTTCGTCGAGGTCAGCGCCAGCCCG 2600 
QL QAQGDTVFVEVSAS P 
GTGTTGTTGCAGGCGATGGACGACGATGTCGTCACGGTTGCCACGCTGCG 2650
VLLQAMDDDVVTVATLR 
TCGTGACGACGGCGACGCCACCCGGATGCTCACCGCCCTGGCACAGGCCT 2700 

RDDGDATRMLTALAQA 
ATGTCCACGGCGTCACCGTCGACTGGCCCGCCATCCTCGGCACCACCACA 27 50 
YVHGVTVDWPAILGTTT

ACCCGGGTACTGGACCTTCCGACCTACGCCTTCCAACACCAGCGGTACTG 2800 

TRVLDLPTYAF QHQRYW 
GCTCGAGTCGGCACGCCCGGCCGCATCCGACGCGGGCCACCCCGTGCTGG 2850 
LESARPAAS DAGHPVL 
GCTCCGGTATCGCCCTCGCCGGGTCGCCGGGCCGGGTGTTCACGGGTTCC 2900
GSGIALAGS PGRVFTGS 
GTGCCGACCGGTGCGGACCGCGCGGTGTTCGTCGCCGAGCTGGCGCTGGC 2950 
VPTGADRAVFVAELALA

CGCCGCGGACGCGGTCGACTGCGCCACGGTCGAGCGGCTCGACATCGCCT 3000

AADAVDCATVERLDIA
CCGTGCCCGGCCGGCCGGGCCATGGCCGGACGACCGTACAGACCTGGGTC 3050

SVPGRPGHGRTTVQTWV

GACGAGCCGGCGGACGACGGCCGGCGCCGGTTCACCGTGCACACCCGCAC 3100
DEPADDGRRRFTVHTRT
CGGCGACGCCCCGTGGACGCTGCACGCCGAGGGGGTGCTGCGCCCCCATG 3150
GDAPWTLHAEGVLRPH
GCACGGCCCTGCCCGATGCGGCCGACGCCGAGTGGCCCCCACCGGGCGCG 3200

GTALPDAADAEWPPPGA

GTGCCCGCGGACGGGCTGCCGGGTGTGTGGCGCCGGGGGGACCAGGTCTT 3250

VPADGLPGVWRRGDQVF
CGCCGAGGCCGAGGTGGACGGACCGGACGGTTTCGTGGTGCACCCCGACC 3300

AEAEVDGPDGFVVHPD



TGCTCGACGCGGTCTTCTCCGCGGTCGGCGACGGAAGCCGCCAGCCGGCC 3350 
LLDAVFSAVGDGSRQPA 
GGATGGCGCGACCTGACGGTGCACGCGTCGGACGCCACCGTACTGCGCGC 3400
GWRDLTVHAS DATVL RA 
CTGCCTCACCCGGCGCACCGACGGAGCCATGGGATTCGCCGCCTTCGACG 34 50 

CLTRRTDGAMGFAAFD 
GCGCCGGCCTGCCGGTACTCACCGCGGAGGCGGTGACGCTGCGGGAGGTG 3500 
GAGLPVLTAEAVTLREV

GCGTCACCGTCCGGCTCCGAGGAGTCGGACGGCCTGCACCGGTTGGAGTG 3550 

ASPSGSEES DGLHRLEW 
GCTCGCGGTCGCCGAGGCGGTCTACGACGGTGACCTGCCCGAGGGACATG 3600 
LAVAEAVYDGDLPEGH 
TCCTGATCACCGCCGCCCACCCCGACGACCCCGAGGACATACCCACCCGC 3650
VLI TAAHPDDPEDI PTR 
GCCCACACCCGCGCCACCCGCGTCCTGACCGCCCTGCAACACCACCTCAC 3700 

AHTRATRVLTALQHHLT 
CACCACCGACCACACCCTCATCGTCCACACCACCACCGACCCCGCCGGCG 37 50 
TTDHTLIVHTTTDPAG

CCACCGTCACCGGCCTCACCCGCACCGCCCAGAACGAACACCCCCACCGC 3800 
ATVTGLTRTAQNEHPHR
ATCCGCCTCATCGAAACCGACCACCCCCACACCCCCCTCCCCCTGGCCCA 3850 
IRLIETDHPHTPLPLAQ 
ACTCGCCACCCTCGACCACCCCCACCTCCGCCTCACCCACCACACCCTCC 3900

LATLDHPHLRLTHHTL 

ACCACCCCCACCTCACCCCCCTCCACACCACCACCCCACCCACCACCACC 3950 
HHPHLTPLHTTTPPTTT 

CCCCTCAACCCCGAACACGCCATCATCATCACCGGCGGCTCCGGCACCCT 4 000 

PLNPEHAI I ITGGSGTL 

CGCCGGCATCCTCGCCCGCCACCTGAACCACCCCCACACCTACCTCCTCT 4 050 

AGILARHLNHPHTYLL 

CCCGCACCCCACCCCCCGACGCCACCCCCGGCACCCACCTCCCCTGCGAC 4100 
SRTPPPDATPGTHLPCD 

GTCGGCGACCCCCACCAACTCGCCACCACCCTCACCCACATCCCCCAACC 4150 

VGDPHQLATTLTHIPQP 

CCTCACCGCCATCTTCCACACCGCCGCCACCCTCGACGACGGCATCCTCC 4 200 

LTAIFHTAATLDDGIL

ACGCCCTCACCCCCGACCGCCTCACCACCGTCCTCCACCCCAAAGCCAAC 4 250 
HALTPDRLTTVLHPKAN 

GCCGCCTGGCACCTGCACCACCTCACCCAAAACCAACCCCTCACCCACTT 4 300 

AAWHLHHLTQNQPLTH F 

CGTCCTCTACTCCAGCGCCGCCGCCGTCCTCGGCAGCCCCGGACAAGGAA 4 350 

VLYSSAAAVLGSPGQG 

ACTACGCCGCCGCCAACGCCTTCCTCGACGCCCTCGCCACCCACCGCCAC 4 4 00 
NYAAANAFLDALATHRH 

ACCCTCGGCCAACCCGCCACCTCCATCGCCTGGGGCATGTGGCACACCAC 4 4 50 

TLGQPATSIAWGMWHTT

CAGCACCCTCACCGGACAACTCGACGACGCCGACCGGGACCGCATCCGCC 4 500 

STLT GQLDDAD RDRIR 
GCGGCGGTTTCCTCCCGATCACGGACGACGAGGGCATGGGGATGCAT 
RGGFLPITDDEG 
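
Because the two rapamycin AT cassettes listed above share the same FK-520 flanking sequence and differ only in the swapped AT region, the extent of the replacement can be located by comparing the listings from both ends. The sketch below is illustrative only: the function name `shared_flanks` and the short toy sequences are assumptions, chosen so that the flanks end and begin with the NheI (GCTAGC) and XhoI (CTCGAG) recognition sequences.

```python
def shared_flanks(a: str, b: str) -> tuple:
    """Return (common prefix length, common suffix length) for sequences a and b."""
    limit = min(len(a), len(b))
    prefix = 0
    while prefix < limit and a[prefix] == b[prefix]:
        prefix += 1
    suffix = 0
    # The suffix scan stops before it could overlap the shared prefix.
    while suffix < limit - prefix and a[-1 - suffix] == b[-1 - suffix]:
        suffix += 1
    return prefix, suffix


# Toy cassettes: identical flanks, different (hypothetical) AT-region inserts.
cassette_a = "AAAGCTAGC" + "TTTT" + "CTCGAGCCC"
cassette_b = "AAAGCTAGC" + "GGGGGG" + "CTCGAGCCC"
prefix_len, suffix_len = shared_flanks(cassette_a, cassette_b)
print(prefix_len, suffix_len)
```

Applied to full cassette sequences, the region between the two returned boundaries is the part contributed by the donor AT domain.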

Phage KC515 DNA was prepared using the procedure described in Genetic
Manipulation of Streptomyces, A Laboratory Manual, edited by D. Hopwood et al. A
phage suspension prepared from 10 plates (100 mm) of confluent plaques of KC515 on S.
lividans TK24 generally gave about 3 µg of phage DNA. The DNA was ligated to
circularize at the cos site, subsequently digested with restriction enzymes BamHI and
PstI, and dephosphorylated with SAP.

Each module 8 cassette described above was excised with restriction enzymes
BglII and NsiI and ligated into the compatible BamHI and PstI sites of KC515 phage
DNA prepared as described above. The ligation mixture containing KC515 and various
cassettes was transfected into protoplasts of Streptomyces lividans TK24 using the
procedure described in Genetic Manipulation of Streptomyces, A Laboratory Manual,
edited by D. Hopwood et al., and overlaid with TK24 spores. After 16-24 hr, the plaques
were restreaked on plates overlaid with TK24 spores. Single plaques were picked and
resuspended in 200 µL of nutrient broth. Phage DNA was prepared by the boiling method
(Hopwood et al., supra). PCR with primers spanning the left and right boundaries of
the recombinant phage was used to verify that the correct phage had been isolated. In most
cases, at least 80% of the plaques contained the expected insert. To confirm the presence
of the resistance marker (thiostrepton), a spot test is used, as described in Lomovskaya et
al. (1997), in which a plate with spots of phage is overlaid with a mixture of spores of
TK24 and a phiC31 TK24 lysogen. After overnight incubation, the plate is overlaid with
antibiotic in soft agar. A working stock is made of all phage containing desired
constructs.
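
The compatibility of the BglII and NsiI cassette ends with the BamHI and PstI ends of the vector follows from the single-stranded overhangs each enzyme leaves. The following sketch, offered as an illustration rather than as part of the described procedure, derives the overhang from each recognition sequence and its top-strand cut position; the dictionary and function names are hypothetical, and the site data are the commonly published values for these four enzymes.

```python
# Recognition sequence and cut position on the top strand (bases before the cut).
ENZYMES = {
    "BglII": ("AGATCT", 1),  # A^GATCT
    "BamHI": ("GGATCC", 1),  # G^GATCC
    "NsiI": ("ATGCAT", 5),   # ATGCA^T
    "PstI": ("CTGCAG", 5),   # CTGCA^G
}


def overhang(name: str) -> tuple:
    """Return (overhang type, overhang sequence) for a palindromic site."""
    site, cut = ENZYMES[name]
    mirror = len(site) - cut  # cut position on the bottom strand, in top-strand coordinates
    lo, hi = sorted((cut, mirror))
    kind = "5'" if cut < mirror else "3'"
    return kind, site[lo:hi]


def compatible(enzyme_a: str, enzyme_b: str) -> bool:
    """Two ends can anneal when overhang type and sequence both match."""
    return overhang(enzyme_a) == overhang(enzyme_b)


for pair in (("BglII", "BamHI"), ("NsiI", "PstI"), ("BglII", "PstI")):
    print(pair, overhang(pair[0]), overhang(pair[1]), compatible(*pair))
```

Because the two ends of the cassette carry different overhangs, the insert can ligate into the digested vector in only one orientation.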

Streptomyces hygroscopicus ATCC 14891 (see U.S. Patent No. 3,244,592, issued
5 Apr. 1966, incorporated herein by reference) mycelia were infected with the
recombinant phage by mixing the spores and phage (1 x 10^8 of each) and incubating on
R2YE agar (Genetic Manipulation of Streptomyces, A Laboratory Manual, edited by D.
Hopwood et al.) at 30°C for 10 days. Recombinant clones were selected and plated on
minimal medium containing thiostrepton (50 ng/ml) to select for the thiostrepton
resistance-conferring gene. Primary thiostrepton-resistant clones were isolated and
purified through a second round of single colony isolation, as necessary. To obtain
thiostrepton-sensitive revertants that underwent a second recombination event to evict the
phage genome, primary recombinants were propagated in liquid media for two to three
days in the absence of thiostrepton and then spread on agar medium without thiostrepton
to obtain spores. Spores were plated to obtain about 50 colonies per plate, and
thiostrepton-sensitive colonies were identified by replica plating onto
thiostrepton-containing agar medium. PCR was used to determine which of the thiostrepton-
sensitive colonies reverted to the wild type (reversal of the initial integration event), and
which contained the desired AT swap at module 8 in the ATCC 14891-derived cells. The
PCR primers used amplified either the KS/AT junction or the AT/DH junction of the
wild-type and the desired recombinant strains. Fermentation of the recombinant strains,
followed by isolation of the metabolites and analysis by LC-MS and NMR, is used to
characterize the novel polyketide compounds.
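
The PCR screen described above distinguishes revertants from the desired recombinants by whether a junction-spanning primer pair yields a product. A minimal in-silico sketch of that logic is given below; the primer and template sequences are short hypothetical stand-ins, not the primers actually used, and exact string matching is assumed in place of real annealing behavior.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def amplicon_length(template: str, fwd: str, rev: str):
    """Length of the product primed by fwd (top strand) and rev (bottom strand), or None."""
    start = template.find(fwd)
    if start < 0:
        return None
    rev_site = reverse_complement(rev)  # where rev anneals, read on the top strand
    end = template.find(rev_site, start + len(fwd))
    if end < 0:
        return None
    return end + len(rev_site) - start


# Hypothetical KS-side flank shared by both strains, followed by either AT region.
ks_flank = "ATGGCCTGCCGT"
wild_type = ks_flank + "CACGACCCCACA"
recombinant = ks_flank + "GTGTGGGACCTG"

fwd_primer = "GCCTGCCG"                       # anneals in the shared flank
rev_primer = reverse_complement("GGGACCTG")   # specific to the swapped AT region

print(amplicon_length(recombinant, fwd_primer, rev_primer))
print(amplicon_length(wild_type, fwd_primer, rev_primer))
```

A colony whose template gives a product with the swap-specific pair but not with a wild-type-specific pair would be scored as carrying the desired AT replacement.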
Example 2

Replacement of Methoxyl with Hydrogen or Methyl at C-13 of FK-506 
The present invention also provides the 13-desmethoxy derivatives of FK-506 and 
the novel PKS enzymes that produce them. A variety of Streptomyces strains that produce 
FK-506 are known in the art, including S. tsukubaensis No. 9993 (FERM BP-927),
described in U.S. Patent No. 5,624,852, incorporated herein by reference; S.
hygroscopicus subsp. yakushimaensis No. 7238, described in U.S. Patent No. 4,894,366,
incorporated herein by reference; S. sp. MA6858 (ATCC 55098), described in U.S.
Patent No. 5,116,756, incorporated herein by reference; and S. sp. MA6548, described
in Motamedi et al., 1998, "The biosynthetic gene cluster for the macrolactone ring of the
immunosuppressant FK-506," Eur. J. Biochem. 256: 528-534, and Motamedi et al., 1997,
"Structural organization of a multifunctional polyketide synthase involved in the
biosynthesis of the macrolide immunosuppressant FK-506," Eur. J. Biochem. 244: 74-80,
each of which is incorporated herein by reference.

The complete sequence of the FK-506 gene cluster from Streptomyces sp.
MA6548 is known, and the sequences of the corresponding gene clusters from other
FK-506-producing organisms are highly homologous thereto. The novel FK-506 recombinant
gene clusters of the present invention differ from the naturally occurring gene clusters in
that the AT domain of module 8 of the naturally occurring PKSs is replaced by an AT
domain specific for malonyl CoA or methylmalonyl CoA. These AT domain
replacements are made at the DNA level, following the methodology described in
Example 1.
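
At the sequence level, an AT domain replacement of this kind amounts to exchanging the DNA between two flanking restriction sites in the host module for the corresponding stretch of a donor fragment. The sketch below illustrates that operation on strings only; the function name `swap_between_sites` and the toy sequences are assumptions, and the AvrII (CCTAGG) and XhoI (CTCGAG) recognition sequences are used merely as example flanking sites.

```python
def swap_between_sites(host: str, donor: str, left_site: str, right_site: str) -> str:
    """Replace the host DNA between left_site and right_site with the donor's."""

    def inner_bounds(seq: str) -> tuple:
        # Locate the first left_site and the next right_site after it.
        i = seq.find(left_site)
        j = seq.find(right_site, i + len(left_site)) if i >= 0 else -1
        if i < 0 or j < 0:
            raise ValueError("both sites must be present, left before right")
        return i + len(left_site), j

    host_start, host_end = inner_bounds(host)
    donor_start, donor_end = inner_bounds(donor)
    # Host flanks (including both sites) are kept; only the interior is exchanged.
    return host[:host_start] + donor[donor_start:donor_end] + host[host_end:]


# Toy example with hypothetical interiors between the two example sites.
host_module = "AAA" + "CCTAGG" + "TTTTT" + "CTCGAG" + "GGG"
donor_fragment = "C" + "CCTAGG" + "ACGTACG" + "CTCGAG" + "T"
hybrid = swap_between_sites(host_module, donor_fragment, "CCTAGG", "CTCGAG")
print(hybrid)
```

In practice the exchange is carried out by restriction digestion and ligation; the string operation only models the resulting hybrid sequence.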

The naturally occurring module 8 sequence for the MA6548 strain is shown below,
followed by the illustrative hybrid module 8 sequences for the MA6548 strain.

GCATGCGGCTGTACGAGGCGGCACGGCGCACCGGAAGTCCCGTGGTGGTG 50
MRLYEAARRTGSPVVV
GCGGCCGCGCTCGACGACGCGCCGGACGTGCCGCTGCTGCGCGGGCTGCG 100
AAALDDAPDVPLLRGLR
GCGTACGACCGTCCGGCGTGCCGCCGTCCGGGAACGCTCTCTCGCCGACC 150
RTTVRRAAVRERSLAD

GCTCGCCGTGCTGCCCGACGACGAGCGTGCCGACGCCTCCCTCGCGTTCG 200

RSPCCPTTSAVPTPPSRS
TCCTGGAACAGCACCGCCACCGTGCTCGGCCACCTGGGCGCCGAAGACAT 250
SWNSTATVLGHLGAEDI
CCCGGCGACGACGACGTTCAAGGAACTCGGCATCGACTCGCTCACCGCGG 300

PATTTFKELGI DSLTA 
TCCAGCTGCGCAACGCGCTGACCACGGCGACCGGCGTACGCCTCAACGCC 350 
V QLRNALTTATGVRLNA 
ACAGCGGTCTTCGACTTTCCGACGCCGCGCGCGCTCGCCGCGAGACTCGG 400

TAVFDFPT PRALAARLG 
CGACGAGCTGGCCGGTACCCGCGCGCCCGTCGCGGCCCGGACCGCGGCCA 4 50 

DELAGTRAPVAARTAA 
CCGCGGCCGCGCACGACGAACCGCTGGCGATCGTGGGCATGGCCTGCCGT 500 
TAAAHDEPLAIVGMACR

CTGCCGGGCGGGGTCGCGTCGCCACAGGAGCTGTGGCGTCTCGTCGCGTC 550 

LPGGVAS PQELWRLVAS 
CGGGACCGACGCCATCACGGAGTTCCCCGCGGACCGCGGCTGGGACGTGG 600 
GTDAITEFPADRGWDV 
ACGCGCTCTACGACCCGGACCCCGACGCGATCGGCAAGACCTTCGTCCGG 650

DALYDPDPDAIGKTFVR

CACGGCGGCTTCCTCGACGGTGCGACCGGCTTCGACGCGGCGTTCTTCGG 700
HGGFLDGATGFDAAFFG
GATCAGCCCGCGCGAGGCCCTGGCCATGGACCCGCAGCAACGGGTGCTCC 750

ISPREALAMDPQQRVL

TGGAGACGTCCTGGGAGGCGTTCGAAAGCGCGGGCATCACCCCGGACGCG 800

LETSWEAFESAGITPDA
GCGCGGGGCAGCGACACCGGCGTGTTCATCGGCGCGTTCTCCTACGGGTA 850

ARGSDTGVFIGAFSYGY

CGGCACGGGTGCGGATACCAACGGCTTCGGCGCGACAGGGTCGCAGACCA 900

GTGADTNGFGATGSQT

GCGTGCTCTCCGGCCGCCTCTCGTACTTCTACGGTCTGGAGGGCCCTTCG 950
SVLSGRLSYFYGLEGPS

GTCACGGTCGACACCGCCTGCTCGTCGTCACTGGTCGCCCTGCACCAGGC 1000

VTVDTACSSSLVALHQA

AGGGCAGTCCCTGCGCTCGGGCGAATGCTCGCTCGCCCTGGTCGGCGGTG 1050

GQSLRSGECSLALVGG 
TCACGGTGATGGCGTCGCCCGGCGGATTCGTCGAGTTCTCCCGGCAGCGC 1100 
VTVMAS PGG FVEFSRQR 
GGGCTCGCGCCGGACGGGCGGGCGAAGGCGTTCGGCGCGGGCGCGGACGG 1150

GLAPDGRAKAFGAGADG 
TACGAGCTTCGCCGAGGGCGCCGGTGCCCTGGTGGTCGAGCGGCTCTCCG 1200 

TS FAEGA GALVVERLS 
ACGCGGAGCGCCACGGCCACACCGTCCTCGCCCTCGTACGCGGCTCCGCG 1250 
DAERHGHTVLALVRGSA

GCTAACTCCGACGGCGCGTCGAACGGTCTGTCGGCGCCGAACGGCCCCTC 1300 

ANSDGASNGLSAPNGPS 
CCAGGAACGCGTCATCCACCAGGCCCTCGCGAACGCGAAACTCACCCCCG 1350 
QER VI HQALANAKLT P 
CCGATGTCGACGCGGTCGAGGCGCACGGCACCGGCACCCGCCTCGGCGAC 1400

ADVDAVEAHGTGTRLGD 
CCCATCGAGGCGCAGGCGCTGCTCGCGACGTACGGACAGGACCGGGCGAC 1450 

PI EAQALLATYGQDR AT 
GCCCCTGCTGCTCGGCTCGCTGAAGTCGAACATCGGGCACGCCCAGGCCG 1500 
PLLLGSLKSNIGHAQA

CGTCAGGGGTCGCCGGGATCATCAAGATGGTGCAGGCCATCCGGCACGGG 1550 
ASGVAGI IKMVQAI RHG 
GAACTGCCGCCGACACTGCACGCGGACGAGCCGTCGCCGCACGTCGACTG 1600 
ELPPTLHADEPSPHVDW 
GACGGCCGGTGCCGTCGAGCTCCTGACGTCGGCCCGGCCGTGGCCGGGGA 1650

TAGAVELLTSARPWPG 
CCGGTCGCCCGCGCCGCGCTGCCGTCTCGTCGTTCGGCGTGAGCGGCACG 17 00 
TGRPRRAAVSSFGVSGT 
AACGCCCACATCATCCTTGAGGCAGGACCGGTCAAAACGGGACCGGTCGA 1750

NAHI ILEAGPVKTGPVE 
GGCAGGAGCGATCGAGGCAGGACCGGTCGAAGTAGGACCGGTCGAGGCTG 1800 

A G A I EAGPVEVGPVEA 
GACCGCTCCCCGCGGCGCCGCCGTCAGCACCGGGCGAAGACCTTCCGCTG 1850 
GPLPAAPPSAPGEDLPL

CTCGTGTCGGCGCGTTCCCCGGAGGCACTCGACGAGCAGATCGGGCGCCT 1900 

LVSARSPEALDEQIGRL
GCGCGCCTATCTCGACACCGGCCCGGGCGTCGACCGGGCGGCCGTGGCGC 1950
RAYLDTGPGVDRAAVA
AGACACTGGCCCGGCGTACGCACTTCACCCACCGGGCCGTACTGCTCGGG 2000

QTLARRTHFTHRAVLLG 
GACACCGTCATCGGCGCTCCCCCCGCGGACCAGGCCGACGAACTCGTCTT 2050 

DTVIGAPPADQADELVF 
CGTCTACTCCGGTCAGGGCACCCAGCATCCCGCGATGGGCGAGCAACTCG 2100 
VYSGQGTQHPAMGEQL

CGGCCGCGTTCCCCGTGTTCGCCGATGCCTGGCACGACGCGCTCCGACGG 2150 
AAAFPVFADAWHDALRR 
CTCGACGACCCCGACCCGCACGACCCCACACGGAGCCAGCACACGCTCTT 2200 
LDDPDPHDPTRSQHTLF
CGCCCACCAGGCGGCGTTCACCGCCCTCCTGAGGTCCTGGGACATCACGC 2250

AHQAAFTALLRSWDIT
CGCACGCCGTCATCGGCCACTCGCTCGGCGAGATCACCGCCGCGTACGCC 2300 
PHAVI GHSLGEITAAYA 
GCCGGGATCCTGTCGCTCGACGACGCCTGCACCCTGATCACCACGCGTGC 2350 
AGILSLDDACTLITTRA

CCGCCTCATGCACACGCTTCCGCCGCCCGGCGCCATGGTCACCGTGCTGA 24 00 

RLMHTLPPPGAMVTVL 
CCAGCGAGGAGGAGGCCCGTCAGGCGCTGCGGCCGGGCGTGGAGATCGCC 24 50 
TSEEEARQALRPGV EIA 
GCGGTCTTCGGCCCGCACTCCGTCGTGCTCTCGGGCGACGAGGACGCCGT 2500

AVFGPHSVVLSGDEDAV 
GCTCGACGTCGCACAGCGGCTCGGCATCCACCACCGTCTGCCCGCGCCGC 2550 

LDVAQRLGIHHRLPAP 
ACGCGGGCCACTCCGCGCACATGGAACCCGTGGCCGCCGAGCTGCTCGCC 2 600 
HAGHSAHMEPVAAELLA

ACCACTCGCGAGCTCCGTTACGACCGGCCCCACACCGCCATCCCGAACGA 2 650 

TTRELRYDRPHTAIPND 
CCCCACCACCGCCGAGTACTGGGCCGAGCAGGTCCGCAACCCCGTGCTGT 2700 
PTTAEYWAEQVRNPVL 
TCCACGCCCACACCCAGCGGTACCCCGACGCCGTGTTCGTCGAGATCGGC 2750

F HAH T QRY P DAV FVE I G 
CCCGGCCAGGACCTCTCACCGCTGGTCGACGGCATCGCCCTGCAGAACGG 2800 

PG QD LSPLVDGIALQNG 
CACGGCGGACGAGGTGCACGCGCTGCACACCGCGCTCGCCCGCCTCTTCA 2850 
TADEVHALHTALARLF

CACGCGGCGCCACGCTCGACTGGTCCCGCATCCTCGGCGGTGCTTCGCGG 2900 
TRGATLDWSRILGGASR 
CACGACCCTGACGTCCCCTCGTACGCGTTCCAGCGGCGTCCCTACTGGAT 2950 
HDPDVPSYAFQRRPYWI 
CGAGTCGGCTCCCCCGGCCACGGCCGACTCGGGCCACCCCGTCCTCGGCA 3000

ESAPPATADSGHPVLG 
CCGGAGTCGCCGTCGCCGGGTCGCCGGGCCGGGTGTTCACGGGTCCCGTG 3050 
TGVAVAGSPG RVFTGPV 
CCCGCCGGTGCGGACCGCGCGGTGTTCATCGCCGAACTGGCGCTCGCCGC 3100

PAGADRAV F IAELALAA 
CGCCGACGCCACCGACTGCGCCACGGTCGAACAGCTCGACGTCACCTCCG 3150 

ADATDCATVEQLDVTS 
TGCCCGGCGGATCCGCCCGCGGCAGGGCCACCGCGCAGACCTGGGTCGAT 3200 
VPGGSARGRATAQTWVD

GAACCCGCCGCCGACGGGCGGCGCCGCTTCACCGTCCACACCCGCGTCGG 3250 

EPAADGRRRFTVHTRVG 
CGACGCCCCGTGGACGCTGCACGCCGAGGGGGTTCTCCGCCCCGGCCGCG 3300 
DAPWTLHAEGVLRPGR 
TGCCCCAGCCCGAAGCCGTCGACACCGCCTGGCCCCCGCCGGGCGCGGTG 3350

VPQPEAVDTAWPPPGAV 
CCCGCGGACGGGCTGCCCGGGGCGTGGCGACGCGCGGACCAGGTCTTCGT 34 00 
PADGLPGAWRRADQVFV 
CGAAGCCGAAGTCGACAGCCCTGACGGCTTCGTGGCACACCCCGACCTGC 3450

EAEVDSPDGFVAHPDL
TCGACGCGGTCTTCTCCGCGGTCGGCGACGGGAGCCGCCAGCCGACCGGA 3500

LDAVFSAVGDGSRQPTG

TGGCGCGACCTCGCGGTGCACGCGTCGGACGCCACCGTGCTGCGCGCCTG 3550

WRDLAVHASDATVLRAC

CCTCACCCGCCGCGACAGTGGTGTCGTGGAGCTCGCCGCCTTCGACGGTG 3600

LTRRDSGVVELAAFDG
CCGGAATGCCGGTGCTCACCGCGGAGTCGGTGACGCTGGGCGAGGTCGCG 3650

AGMPVLTAESVTLGEVA

TCGGCAGGCGGATCCGACGAGTCGGACGGTCTGCTTCGGCTTGAGTGGTT 3700

SAGGSDESDGLLRLEWL
GCCGGTGGCGGAGGCCCACTACGACGGTGCCGACGAGCTGCCCGAGGGCT 3750

PVAEAHYDGADELPEG
ACACCCTCATCACCGCCACACACCCCGACGACCCCGACGACCCCACCAAC 3800
YTLITATHPDDPDDPTN
CCCCACAACACACCCACACGCACCCACACACAAACCACACGCGTCCTCAC 3850

PHNTPTRTHTQTTRVLT 
CGCCCTCCAACACCACCTCATCACCACCAACCACACCCTCATCGTCCACA 3900 

ALQHHLITTNHTLIVH 
CCACCACCGACCCCCCAGGCGCCGCCGTCACCGGCCTCACCCGCACCGCA 3950 
TTTDPPGAAVTGLTRTA

CAAAACGAACACCCCGGCCGCATCCACCTCATCGAAACCCACCACCCCCA 4 000 

QNEHPGRIHLIETHHPH 
CACCCCACTCCCCCTCACCCAACTCACCACCCTCCACCAACCCCACCTAC 4050 
TPLPLT QLTTLHQPHL 
GCCTCACCAACAACACCCTCCACACCCCCCACCTCACCCCCATCACCACC 4100

RLTNNTLHTPHLTPITT 
CACCACAACACCACCACAACCACCCCCAACACCCCACCCCTCAACCCCAA 4150 

HHNTTTTTPNTPPLNPN 
CCACGCCATCCTCATCACCGGCGGCTCCGGCACCCTCGCCGGCATCCTCG 4200 
HAILITGGSGTLAGIL

CCCGCCACCTCAACCACCCCCACACCTACCTCCTCTCCCGCACACCACCA 4250 
ARHLNHPHTYLLSRTPP 
CCCCCCACCACACCCGGCACCCACATCCCCTGCGACCTCACCGACCCCAC 4300 
PPTTPGTHI PCDLTDPT 
CCAAATCACCCAAGCCCTCACCCACATACCACAACCCCTCACCGGCATCT 4350

QITQALTHIPQPLTGI 
TCCACACCGCGGCCACCCTCGACGACGCCACCCTCACCAACCTCACCCCC 4 400 
FHTAATLDDATLTNLTP 
CAACACCTCACCACCACCCTCCAACCCAAAGCCGACGCCGCCTGGCACCT 4450

QHLTTTLQPKADAAWHL 
CCACCACCACACCCAAAACCAACCCCTCACCCACTTCGTCCTCTACTCCA 4 500 

HHHTQNQPLTHFVLYS 
GCGCGGCCGCCACCCTCGGCAGCCCCGGCCAAGCCAACTACGCCGCCGCC 4 550 
SAAATLGSPGQANYAAA

AACGCCTTCCTCGACGCCCTCGCCACCCACCGCCACACCCAAGGACAACC 4 600 

NAFLDALATHRHTQGQP 
CGCCACCACCATCGCCTGGGGCATGTGGCACACCACCACCACACTCACCA 4 650 
ATT IAWGMWHTTTTLT 
GCCAACTCACCGACAGCGACCGCGACCGCATCCGCCGCGGCGGCTTCCTG 4700

SQLTDSDRDRIRRGGFL 
CCGATCTCGGACGACGAGGGCATGC 
PISDDEGM 

The AvrII-XhoI hybrid FK-506 PKS module 8 containing the AT domain of
module 12 of rapamycin is shown below.

GCATGCGGCTGTACGAGGCGGCACGGCGCACCGGAAGTCCCGTGGTGGTG 50

MRLYEAARRTGSPVVV

GCGGCCGCGCTCGACGACGCGCCGGACGTGCCGCTGCTGCGCGGGCTGCG 100

AAALDDAPDVPLLRGLR

GCGTACGACCGTCCGGCGTGCCGCCGTCCGGGAACGCTCTCTCGCCGACC 150

RTTVRRAAVRERSLAD

GCTCGCCGTGCTGCCCGACGACGAGCGCGCCGACGCCTCCCTCGCGTTCG 200 



RSPCCPTT SAPTPPSRS 
TCCTGGAACAGCACCGCCACCGTGCTCGGCCACCTGGGCGCCGAAGACAT 250
SWNSTATVLGHLGAEDI 
CCCGGCGACGACGACGTTCAAGGAACTCGGCATCGACTCGCTCACCGCGG 300 

PATTTFKELGIDSLTA 
TCCAGCTGCGCAACGCGCTGACCACGGCGACCGGCGTACGCCTCAACGCC 350 
VQLRNALTTATGVRLNA

ACAGCGGTCTTCGACTTTCCGACGCCGCGCGCGCTCGCCGCGAGACTCGG 4 00 

TAVFDFPT PRALAARLG 
CGACGAGCTGGCCGGTACCCGCGCGCCCGTCGCGGCCCGGACCGCGGCCA 450 
DELAGTRAPVAARTAA 
CCGCGGCCGCGCACGACGAACCGCTGGCGATCGTGGGCATGGCCTGCCGT 500
TAAAHDEPLAIVGMACR 
CTGCCGGGCGGGGTCGCGTCGCCACAGGAGCTGTGGCGTCTCGTCGCGTC 550 

LPGGVASPQELWRLVAS 
CGGCACCGACGCCATCACGGAGTTCCCCGCGGACCGCGGCTGGGACGTGG 600 
GTDAITEFPADRGWDV

ACGCGCTCTACGACCCGGACCCCGACGCGATCGGCAAGACCTTCGTCCGG 650 
DALYDPDPDAIGKTFVR 
CACGGCGGCTTCCTCGACGGTGCGACCGGCTTCGACGCGGCGTTCTTCGG 700 
HGGFLDGATGFDAAFFG 
GATCAGCCCGCGCGAGGCCCTGGCCATGGACCCGCAGCAACGGGTGCTCC 750
ISPREALAMDPQQRVL
TGGAGACGTCCTGGGAGGCGTTCGAAAGCGCGGGCATCACCCCGGACGCG 800
LETSWEAFESAGITPDA
GCGCGGGGCAGCGACACCGGCGTGTTCATCGGCGCGTTCTCCTACGGGTA 850 

ARGSDTGVFIGAFSYGY 
CGGCACGGGTGCGGATACCAACGGCTTCGGCGCGACAGGGTCGCAGACCA 900 
GTGADTNGFGATGSQT

GCGTGCTCTCCGGCCGCCTCTCGTACTTCTACGGTCTGGAGGGCCCTTCG 950 
SVLSGRLS YFYGLEGPS 
GTCACGGTCGACACCGCCTGCTCGTCGTCACTGGTCGCCCTGCACCAGGC 1000 
VTVDTACSS S LVALHQA 
AGGGCAGTCCCTGCGCTCGGGCGAATGCTCGCTCGCCCTGGTCGGCGGTG 1050
GQSLRSGECSLALVGG 
TCACGGTGATGGCGTCGCCCGGCGGATTCGTCGAGTTCTCCCGGCAGCGC 1100 
VTVMASPGGFVEFS RQR 
GGGCTCGCGCCGGACGGGCGGGCGAAGGCGTTCGGCGCGGGCGCGGACGG 1150 
GLAPDGRAKAFGAGADG

TACGAGCTTCGCCGAGGGCGCCGGTGCCCTGGTGGTCGAGCGGCTCTCCG 1200 

TSFAEGAGALVVERLS 
ACGCGGAGCGCCACGGCCACACCGTCCTCGCCCTCGTACGCGGCTCCGCG 1250 
DAERHGHTVLALVRGSA 
GCTAACTCCGACGGCGCGTCGAACGGTCTGTCGGCGCCGAACGGCCCCTC 1300
AN SDGASNGLSAPNGPS 
CCAGGAACGCGTCATCCACCAGGCCCTCGCGAACGCGAAACTCACCCCCG 1350 
QERVIHQALANAKLTP 
CCGATGTCGACGCGGTCGAGGCGCACGGCACCGGCACCCGCCTCGGCGAC 1400

ADVDAVEAHGTGTRLGD
CCCATCGAGGCGCAGGCGCTGCTCGCGACGTACGGACAGGACCGGGCGAC 1450

PIEAQALLATYGQDRAT

GCCCCTGCTGCTCGGCTCGCTGAAGTCGAACATCGGGCACGCCCAGGCCG 1500

PLLLGSLKSNIGHAQA

CGTCAGGGGTCGCCGGGATCATCAAGATGGTGCAGGCCATCCGGCACGGG 1550
ASGVAGIIKMVQAIRHG

GAACTGCCGCCGACACTGCACGCGGACGAGCCGTCGCCGCACGTCGACTG 1600 

EL PPTLHADEPSPHVDW 
GACGGCCGGTGCCGTCGAGCTCCTGACGTCGGCCCGGCCGTGGCCGGGGA 1650 
35 TAGAVELLTSARPWPG 

CCGGTCGCCCTAGGCGGGCAGGCGTGTCGTCCTTCGGGATCAGTGGCACC 1700 
TGRPRRAGVSSFGISGT
AACGCCCACGTCATCCTGGAAAGCGCACCCCCCACTCAGCCTGCGGACAA 1750 
NAHV ILESAPPTQPADN 
40 CGCGGTGATCGAGCGGGCACCGGAGTGGGTGCCGTTGGTGATTTCGGCCA 1800 
AVIERAPEWVPLVI S A 
GGACCCAGTCGGCTTTGACTGAGCACGAGGGCCGGTTGCGTGCGTATCTG 1850 
RTQSALTEHEGRLRAYL 
GCGGCGTCGCCCGGGGTGGATATGCGGGCTGTGGCATCGACGCTGGCGAT 1900 
45 AAS PGVDMRAVAS T L AM 

GACACGGTCGGTGTTCGAGCACCGTGCCGTGCTGCTGGGAGATGACACCG 1950 

TRSVFEHRAVLLGDDT 
TCACCGGCACCGCTGTGTCTGACCCTCGGGCGGTGTTCGTCTTCCCGGGA 2000 
VTGTAVSDPRAVFVFPG 
50 CAGGGGTCGCAGCGTGCTGGCATGGGTGAGGAACTGGCCGCCGCGTTCCC 2050 
QGSQRAGMGEELAAAFP 
CGTCTTCGCGCGGATCCATCAGCAGGTGTGGGACCTGCTCGATGTGCCCG 2100 

VFARI HQQVWDLLDVP 
ATCTGGAGGTGAACGAGACCGGTTACGCCCAGCCGGCCCTGTTCGCAATG 2150
DLEVNETGYAQPALFAM
CAGGTGGCTCTGTTCGGGCTGCTGGAATCGTGGGGTGTACGACCGGACGC 2200 

QVAL FGLLESWGVRP DA 
GGTGATCGGCCATTCGGTGGGTGAGCTTGCGGCTGCGTATGTGTCCGGGG 2250 
5 VIG HSVGELAAAYVSG 

TGTGGTCGTTGGAGGATGCCTGCACTTTGGTGTCGGCGCGGGCTCGTCTG 2300 
VWS LEDACTLVS ARARL 
ATGCAGGCTCTGCCCGCGGGTGGGGTGATGGTCGCTGTCCCGGTCTCGGA 2350 
MQALPAGGVMVAVPVS E 
1 0 GGATGAGGCCCGGGCCGTGCTGGGTGAGGGTGTGGAGATCGCCGCGGTCA 24 00 
DEARAVLGEGVE IAAV 
ACGGCCCGTCGTCGGTGGTTCTCTCCGGTGATGAGGCCGCCGTGCTGCAG 24 50 
NGPSSVVLSGDEAAVLQ 
GCCGCGGAGGGGCTGGGGAAGTGGACGCGGCTGGCGACCAGCCACGCGTT 2500 
15 AAEGLGKWTRLATS HAF 

CCATTCCGCCCGTATGGAACCCATGCTGGAGGAGTTCCGGGCGGTCGCCG 2550 

HSARMEPMLEEFRAVA 
AAGGCCTGACCTACCGGACGCCGCAGGTCTCCATGGCCGTTGGTGATCAG 2600 
EGLTYRTPQVSMAVGDQ 
20 GTGACCACCGCTGAGTACTGGGTGCGGCAGGTCCGGGACACGGTCCGGTT 2 650 
VTTAEYWVRQVRDTVRF 
CGGCGAGCAGGTGGCCTCGTACGAGGACGCCGTGTTCGTCGAGCTGGGTG 2700

GEQVASYEDAVFVEL G 
CCGACCGGTCACTGGCCCGCCTGGTCGACGGTGTCGCGATGCTGCACGGC 2750
ADRSLARLVDGVAMLHG

GACCACGAAATCCAGGCCGCGATCGGCGCCCTGGCCCACCTGTATGTCAA 2800 

DHE IQAAIGALAHLYVN 
CGGCGTCACGGTCGACTGGCCCGCGCTCCTGGGCGATGCTCCGGCAACAC 2850 
GVTVDWPALLGDAPAT 
30 GGGTGCTGGACCTTCCGACATACGCCTTCCAGCACCAGCGCTACTGGCTC 2900 
RVLDLPTYAFQHQRYWL 
GAGTCGGCTCCCCCGGCCACGGCCGACTCGGGCCACCCCGTCCTCGGCAC 2 950 

ESAPPATADSGHPVLGT 
CGGAGTCGCCGTCGCCGGGTCGCCGGGCCGGGTGTTCACGGGTCCCGTGC 3000 
35 GVAVAGSPGRVFTGPV 

CCGCCGGTGCGGACCGCGCGGTGTTCATCGCCGAACTGGCGCTCGCCGCC 3050 
PA GAD RAV F I AE LALAA 
GCCGACGCCACCGACTGCGCCACGGTCGAACAGCTCGACGTCACCTCCGT 3100 
ADATDCATVEQLDVTSV 
40 GCCCGGCGGATCCGCCCGCGGCAGGGCCACCGCGCAGACCTGGGTCGATG 3150 
PGGSARGRATAQTWVD
AACCCGCCGCCGACGGGCGGCGCCGCTTCACCGTCCACACCCGCGTCGGC 3200 
EPAADGRRRFTVHTRVG 
GACGCCCCGTGGACGCTGCACGCCGAGGGGGTTCTCCGCCCCGGCCGCGT 3250 
45 DAPWTLHAEGV LRPGRV 

GCCCCAGCCCGAAGCCGTCGACACCGCCTGGCCCCCGCCGGGCGCGGTGC 3300

PQPEAVDTAWPPPGAV 
CCGCGGACGGGCTGCCCGGGGCGTGGCGACGCGCGGACCAGGTCTTCGTC 3350 
PADGLPGAWRRADQVFV 
50 GAAGCCGAAGTCGACAGCCCTGACGGCTTCGTGGCACACCCCGACCTGCT 34 00 
EAEVDSPDGFVAHPDLL 
CGACGCGGTCTTCTCCGCGGTCGGCGACGGGAGCCGCCAGCCGACCGGAT 34 50 

DAVFSAVGDGSRQPTG 
GGCGCGACCTCGCGGTGCACGCGTCGGACGCCACCGTGCTGCGCGCCTGC 3500
WRDLAVHASDATVLRAC
CTCACCCGCCGCGACAGTGGTGTCGTGGAGCTCGCCGCCTTCGACGGTGC 3550 

LTRRDS GVVELAAFDGA 
CGGAATGCCGGTGCTCACCGCGGAGTCGGTGACGCTGGGCGAGGTCGCGT 3600 
5 GMPVLTAESVTLGEVA 

CGGCAGGCGGATCCGACGAGTCGGACGGTCTGCTTCGGCTTGAGTGGTTG 3650 
SAGGSDESDGLLRLEWL 
CCGGTGGCGGAGGCCCACTACGACGGTGCCGACGAGCTGCCCGAGGGCTA 3700 
PVAEAHYDGADELPEGY 
10 CACCCTCATCACCGCCACACACCCCGACGACCCCGACGACCCCACCAACC 3750 
T L I. TATHPDDPDDPTN 
CCCACAACACACCCACACGCACCCACACACAAACCACACGCGTCCTCACC 3800 
PHNTPTRTHTQTTRVLT 
GCCCTCCAACACCACCTCATCACCACCAACCACACCCTCATCGTCCACAC 3850 
15 ALQHHLITTNHTLIVHT 

CACCACCGACCCCCCAGGCGCCGCCGTCACCGGCCTCACCCGCACCGCAC 3900 

TTDPPGAAVTGLTRTA 
AAAACGAACACCCCGGCCGCATCCACCTCATCGAAACCCACCACCCCCAC 3950 
QNEHPGRIHLIETHHPH
ACCCCACTCCCCCTCACCCAACTCACCACCCTCCACCAACCCCACCTACG 4000
TPLPLTQLTTLHQPHLR
CCTCACCAACAACACCCTCCACACCCCCCACCTCACCCCCATCACCACCC 4050
LTNNTLHTPHLTPITT
ACCACAACACCACCACAACCACCCCCAACACCCCACCCCTCAACCCCAAC 4100
HHNTTTTTPNTPPLNPN
CACGCCATCCTCATCACCGGCGGCTCCGGCACCCTCGCCGGCATCCTCGC 4150
HAILITGGSGTLAGILA
CCGCCACCTCAACCACCCCCACACCTACCTCCTCTCCCGCACACCACCAC 4200
RHLNHPHTYLLSRTPP
CCCCCACCACACCCGGCACCCACATCCCCTGCGACCTCACCGACCCCACC 4250
PPTTPGTHIPCDLTDPT
CAAATCACCCAAGCCCTCACCCACATACCACAACCCCTCACCGGCATCTT 4300
QITQALTHIPQPLTGIF
CCACACCGCCGCCACCCTCGACGACGCCACCCTCACCAACCTCACCCCCC 4350
HTAATLDDATLTNLTP

AACACCTCACCACCACCCTCCAACCCAAAGCCGACGCCGCCTGGCACCTC 4 400 
QHLTTTLQPKADAAWHL 
CACCACCACACCCAAAACCAACCCCTCACCCACTTCGTCCTCTACTCCAG 4 4 50 
HHHTQNQPLTHFVL YSS 
40 CGCCGCCGCCACCCTCGGCAGCCCCGGCCAAGCCAACTACGCCGCCGCCA 4 500 
AAATLGS. PGQANYAAA 
ACGCCTTCCTCGACGCCCTCGCCACCCACCGCCACACCCAAGGACAACCC 4 550 
NAFLDALATHRHTQGQP 
GCCACCACCATCGCCTGGGGCATGTGGCACACCACCACCACACTCACCAG 4 600 
45 ATT IAWGMWHTTTTLTS 

CCAACTCACCGACAGCGACCGCGACCGCATCCGCCGCGGCGGCTTCCTGC 4 650 

QLTDSDRDRIRRGGFL 
CGATCTCGGACGACGAGGGCATGC 
P.ISD'DEGM 

The AvrII-XhoI hybrid FK-506 PKS module 8 containing the AT domain of
module 13 of rapamycin is shown below.

GCATGCGGCTGTACGAGGCGGCACGGCGCACCGGAAGTCCCGTGGTGGTG 50

MRLYEAARRTGS PVVV 
GCGGCCGCGCTCGACGACGCGCCGGACGTGCCGCTGCTGCGCGGGCTGCG 100 
AAALDDAPDVPLLRGLR 
5 GCGTACGACCGTCCGGCGTGCCGCCGTCCGGGAACGCTCTCTCGCCGACC 150 
RTTVRRAAVRERSLAD 
GCTCGCCGTGCTGCCCGACGACGAGCGCGCCGACGCCTCCCTCGCGTTCG 200 
RSPCCPTTSAPTPPSRS 
TCCTGGAACAGCACCGCCACCGTGCTCGGCCACCTGGGCGCCGAAGACAT 250 
10 SWNSTATVLGHLGA ED I 

CCCGGCGACGACGACGTTCAAGGAACTCGGCATCGACTCGCTCACCGCGG 300 

PATTTFKELGI DSLTA 
TCCAGCTGCGCAACGCGCTGACCACGGCGACCGGCGTACGCCTCAACGCC 350 
VQLRNALTTATGVRLNA 
15 ACAGCGGTCTTCGACTTTCCGACGCCGCGCGCGCTCGCCGCGAGACTCGG 4 00 
TAVFDFPT PRALAARLG 
CGACGAGCTGGCCGGTACCCGCGCGCCCGTCGCGGCCCGGACCGCGGCCA 4 50 
DELAGTRAPVAARTAA 
CCGCGGCCGCGCACGACGAACCGCTGGCGATCGTGGGCATGGCCTGCCGT 500
TAAAHDEPLAIVGMACR
CTGCCGGGCGGGGTCGCGTCGCCACAGGAGCTGTGGCGTCTCGTCGCGTC 550
LPGGVASPQELWRLVAS
CGGCACCGACGCCATCACGGAGTTCCCCGCGGACCGCGGCTGGGACGTGG 600
GTDAITEFPADRGWDV
ACGCGCTCTACGACCCGGACCCCGACGCGATCGGCAAGACCTTCGTCCGG 650
DALYDPDPDAIGKTFVR
CACGGCGGCTTCCTCGACGGTGCGACCGGCTTCGACGCGGCGTTCTTCGG 700
HGGFLDGATGFDAAFFG
GATCAGCCCGCGCGAGGCCCTGGCCATGGACCCGCAGCAACGGGTGCTCC 750
ISPREALAMDPQQRVL
TGGAGACGTCCTGGGAGGCGTTCGAAAGCGCGGGCATCACCCCGGACGCG 800
LETSWEAFESAGITPDA
GCGCGGGGCAGCGACACCGGCGTGTTCATCGGCGCGTTCTCCTACGGGTA 850
ARGSDTGVFIGAFSYGY
CGGCACGGGTGCGGATACCAACGGCTTCGGCGCGACAGGGTCGCAGACCA 900
GTGADTNGFGATGSQT
GCGTGCTCTCCGGCCGCCTCTCGTACTTCTACGGTCTGGAGGGCCCTTCG 950 
SVLSGRLSYFYGLEGPS 
GTCACGGTCGACACCGCCTGCTCGTCGTCACTGGTCGCCCTGCACCAGGC 1000 
40 VTVDT.A CSS SLVALHQA 

AGGGCAGTCCCTGCGCTCGGGCGAATGCTCGCTCGCCCTGGTCGGCGGTG 1050 

GQSLRSGECSLALVGG 
TCACGGTGATGGCGTCGCCCGGCGGATTCGTCGAGTTCTCCCGGCAGCGC 1100 
VTVMASPGGFVEFSRQR 
45 GGGCTCGCGCCGGACGGGCGGGCGAAGGCGTTCGGCGCGGGCGCGGACGG 1150 
GLAPDGRAKAFGAGADG 
TACGAGCTTCGCCGAGGGCGCCGGTGCCCTGGTGGTCGAGCGGCTCTCCG 1200 

TSFAEGAGALVVERLS 
ACGCGGAGCGCCACGGCCACACCGTCCTCGCCCTCGTACGCGGCTCCGCG 1250 
50 DAERHGHTVLALVRGSA 

GCTAACTCCGACGGCGCGTCGAACGGTCTGTCGGCGCCGAACGGCCCCTC 1300 

ANSDGASNGLSAPNGPS 
CCAGGAACGCGTCATCCACCAGGCCCTCGCGAACGCGAAACTCACCCCCG 1350 
QERVIHQALANAKLTP
CCGATGTCGACGCGGTCGAGGCGCACGGCACCGGCACCCGCCTCGGCGAC 1400
ADVDAVEAHGTGTRLGD 
CCCATCGAGGCGCAGGCGCTGCTCGCGACGTACGGACAGGACCGGGCGAC 14 50 
PIEAQALLATYGQDRAT 
5 GCCCCTGCTGCTCGGCTCGCTGAAGTCGAACATCGGGCACGCCCAGGCCG 1500 
PLLLGSLKSNIGHAQA 
CGTCAGGGGTCGCCGGGATCATCAAGATGGTGCAGGCCATCCGGCACGGG 1550 
ASGVAG'I IKMVQAI RHG 
GAACTGCCGCCGACACTGCACGCGGACGAGCCGTCGCCGCACGTCGACTG 1600 
10 ELP PTLHAD E PS PHVDW 

GACGGCCGGTGCCGTCGAGCTCCTGACGTCGGCCCGGCCGTGGCCGGGGA 1650 

TAGAVELLTSARPWPG 
CCGGTCGCCCTAGGCGGGCGGGCGTGTCGTCCTTCGGAGTCAGCGGCACC 1700 
TGRPRRAGVSS FGVSGT 
1 5 AACGCCCACGTCATCCTGGAGAGCGCACCCCCCGCTCAGCCCGCGGAGGA 1750 
NAHVI LESAPPAQ.PA.EE 
GGCGCAGCCTGTTGAGACGCCGGTGGTGGCCTCGGATGTGCTGCCGCTGG 1800 
AQPVETPVVA-SDVLPL 
TGATATCGGCCAAGACCCAGCCCGCCCTGACCGAACACGAAGACCGGCTG 1850
VISAKTQPALTEHEDRL
CGCGCCTACCTGGCGGCGTCGCCCGGGGCGGATATACGGGCTGTGGCATC 1900
RAYLAASPGADIRAVAS
GACGCTGGCGGTGACACGGTCGGTGTTCGAGCACCGCGCCGTACTCCTTG 1950
TLAVTRSVFEHRAVLL
GAGATGACACCGTCACCGGCACCGCGGTGACCGACCCCAGGATCGTGTTT 2000
GDDTVTGTAVTDPRIVF
GTCTTTCCCGGGCAGGGGTGGCAGTGGCTGGGGATGGGCAGTGCACTGCG 2050
VFPGQGWQWLGMGSALR
CGATTCGTCGGTGGTGTTCGCCGAGCGGATGGCCGAGTGTGCGGCGGCGT 2100
DSSVVFAERMAECAAA
TGCGCGAGTTCGTGGACTGGGATCTGTTCACGGTTCTGGATGATCCGGCG 2150

LREFVDWDLFTVLDDPA 
GTGGTGGACCGGGTTGATGTGGTCCAGCCCGCTTCCTGGGCGATGATGGT 2200 
VVDRVDVVQPASWAMMV 
35 TTCCCTGGCCGCGGTGTGGCAGGCGGCCGGTGTGCGGCCGGATGCGGTGA 2250 
S LAAVWQAAGVRP DAV 
TCGGCCATTCGCAGGGTGAGATCGCCGCAGCTTGTGTGGCGGGTGCGGTG 2300 
I GHSQGE IAAACVA GAV 
TCACTACGCGATGCCGCCCGGATCGTGACCTTGCGCAGCCAGGCGATCGC 2350 
40 SLRDAARIVTLRSQAIA 

CCGGGGCCTGGCGGGCCGGGGCGCGATGGCATCCGTCGCCCTGCCCGGGC 24 00 

RGLAGRGAMASVALPA 
AGGATGTCGAGCTGGTCGACGGGGCCTGGATCGCCGCCCACAACGGGCCC 2450 
QDVELVDGAWIAAHNGP 
45 GCCTCCACCGTGATCGCGGGCACCCCGGAAGCGGTCGACCATGTCCTCAC 2500 
ASTVIAGTPEAVDHVLT 
CGCTCATGAGGCACAAGGGGTGCGGGTGCGGCGGATCACCGTCGACTATG 2550 

AHEAQGVRVRRI TVDY 
CCTCGCACACCCCGCACGTCGAGCTGATCCGCGACGAACTACTCGACATC 2600 
50 ASHTPHVELIRDELLDI 

ACTAGCGACAGCAGCTCGCAGACCCCGCTCGTGCCGTGGCTGTCGACCGT 2 650 

TSDSSSQTPLVPWLSTV 
GGACGGCACCTGGGTCGACAGCCCGCTGGACGGGGAGTACTGGTACCGGA 27 00 
DGTWVDSPLDGEYWYR
ACCTGCGTGAACCGGTCGGTTTCCACCCCGCCGTCAGCCAGTTGCAGGCC 2750
N LRE PVG FH PAVS QLQA 
CAGGGCGACACCGTGTTCGTCGAGGTCAGCGCCAGCCCGGTGTTGTTGCA 2800 
QGDTVFVEVSA SPV. LLQ 
5 GGCGATGGACGACGATGTCGTCACGGTTGCCACGCTGCGTCGTGACGACG 2850 
AMDDDVVTVATLRRDD 
GCGACGCCACCCGGATGCTCACCGCCCTGGCACAGGCCTATGTCCACGGC 2900 
GDATRMLTALAQAYVHG 
GTCACCGTCGACTGGCCCGCCATCCTCGGCACCACCACAACCCGGGTACT 2950 
10 VTVDWPAILGTTTTRVL 

GGACCTTCCGACCTACGCCTTCCAACACCAGCGGTACTGGCTCGAGTCGG 3000

DLPTYAFQHQRYWLES
CTCCCCCGGCCACGGCCGACTCGGGCCACCCCGTCCTCGGCACCGGAGTC 3050

APPATADSGHPVLGTGV 
1 5 GCCGTCGCCGGGTCGCCGGGCCGGGTGTTCACGGGTCCCGTGCCCGCCGG 3100 
AVAGS PGRVFTGPVPAG 
TGCGGACCGCGCGGTGTTCATCGCCGAACTGGCGCTCGCCGCCGCCGACG 3150 
ADRAVFIAELALAAAD
CCACCGACTGCGCCACGGTCGAACAGCTCGACGTCACCTCCGTGCCCGGC 3200
ATDCATVEQLDVTSVPG
GGATCCGCCCGCGGCAGGGCCACCGCGCAGACCTGGGTCGATGAACCCGC 3250
GSARGRATAQTWVDEPA
CGCCGACGGGCGGCGCCGCTTCACCGTCCACACCCGCGTCGGCGACGCCC 3300
ADGRRRFTVHTRVGDA
CGTGGACGCTGCACGCCGAGGGGGTTCTCCGCCCCGGCCGCGTGCCCCAG 3350
PWTLHAEGVLRPGRVPQ
CCCGAAGCCGTCGACACCGCCTGGCCCCCGCCGGGCGCGGTGCCCGCGGA 3400
PEAVDTAWPPPGAVPAD
CGGGCTGCCCGGGGCGTGGCGACGCGCGGACCAGGTCTTCGTCGAAGCCG 3450
GLPGAWRRADQVFVEA

AAGTCGACAGCCCTGACGGCTTCGTGGCACACCCCGACCTGCTCGACGCG 3500 
EVDSPDGFVAHPDLLDA 
GTCTTCTCCGCGGTCGGCGACGGGAGCCGCCAGCCGACCGGATGGCGCGA 3550 
VFSAVGDGSRQPTGWRD 
35 CCTCGCGGTGCACGCGTCGGACGCCACCGTGCTGCGCGCCTGCCTCACCC 3600 
LAVHAS DATVLRACLT 
GCCGCGACAGTGGTGTCGTGGAGCTCGCCGCCTTCGACGGTGCCGGAATG 3650 
RR'DSGVVELAAFDGAGM 
CCGGTGCTCACCGCGGAGTCGGTGACGCTGGGCGAGGTCGCGTCGGCAGG 3700 
40 PVLTAESVTLGEVASAG 

CGGATCCGACGAGTCGGACGGTCTGCTTCGGCTTGAGTGGTTGCCGGTGG 3750 

GSDESDGLLR LEWLPV 
CGGAGGCCCACTACGACGGTGCCGACGAGCTGCCCGAGGGCTACACCCTC 3800
AEAHYDGADELPEGYTL 
45 ATCACCGCCACACACCCCGACGACCCCGACGACCCCACCAACCCCCACAA 3850 
ITATHPDDPDDPTNPHN 
CACACCCACACGCACCCACACACAAACCACACGCGTCCTCACCGCCCTCC 3900 

T PTRTHTQTTRVLTAL 
AACACCACCTCATCACCACCAACCACACCCTCATCGTCCACACCACCACC 3950 
50 QHHLITTNHT'LIVHTTT 

GACCCCCCAGGCGCCGCCGTCACCGGCCTCACCCGCACCGCACAAAACGA 4 000 

DPPGAAVTGLTRTAQNE 
ACACCCCGGCCGCATCCACCTCATCGAAACCCACCACCCCCACACCCCAC 4 050 
HPGRIHLIETHHPHTP
TCCCCCTCACCCAACTCACCACCCTCCACCAACCCCACCTACGCCTCACC 4100
LPLTQLT'TLHQPHLRLT 
AACAACACCCTCCACACCCCCCACCTCACCCCCATCACCACCCACCACAA 4150 
NNTLHTPHLTPITTHHN 
5 CACCACCACAACCACCCCCAACACCCCACCCCTCAACCCCAACCACGCCA 4200 
TTTTTPNTPPLNPNHA 
TCCTCATCACCGGCGGCTCCGGCACCCTCGCCGGCATCCTCGCCCGCCAC 4250 
ILITGG. SGTLAGILARH 
CTCAACCACCCCCACACCTACCTCCTCTCCCGCACACCACCACCCCCCAC 4 300 
10 LNHPHTYLLSRT PPPPT 

CACACCCGGCACCCACATCCCCTGCGACCTCACCGACCCCACCCAAATCA 4 350 

TPGTHIPCDLTDPTQI 
CCCAAGCCCTCACCCACATACCACAACCCCTCACCGGCATCTTCCACACC 4 400 
TQALTHIPQPLTGIFHT 
1 5 GCCGCCACCCTCGACGACGCCACCCTCACCAACCTCACCCCCCAACACCT 4 4 50 
AATLDDATLTNLTPQHL 
CACCACCACCCTCCAACCCAAAGCCGACGCCGCCTGGCACCTCCACCACC 4 500 

TTTLQPKADAAWHLHH 
ACACCCAAAACCAACCCCTCACCCACTTCGTCCTCTACTCCAGCGCCGCC 4 550 
20 HTQNQPLTHFVLYSSAA 

GCCACCCTCGGCAGCCCCGGCCAAGCCAACTACGCCGCCGCCAACGCCTT 4600
ATLGSPGQANYAAANAF
CCTCGACGCCCTCGCCACCCACCGCCACACCCAAGGACAACCCGCCACCA 4650
LDALATHRHTQGQPAT
CCATCGCCTGGGGCATGTGGCACACCACCACCACACTCACCAGCCAACTC 4700
TIAWGMWHTTTTLTSQL
ACCGACAGCGACCGCGACCGCATCCGCCGCGGCGGCTTCCTGCCGATCTC 4750
TDSDRDRIRRGGFLPIS
GGACGACGAGGGCATGC
DDEGM



The NheI-XhoI hybrid FK-506 PKS module 8 containing the AT domain of
module … of rapamycin is shown below.

GCATGCGGCTGTACGAGGCGGCACGGCGCACCGGAAGTCCCGTGGTGGTG 50
MRLYEAARRTGSPVVV
GCGGCCGCGCTCGACGACGCGCCGGACGTGCCGCTGCTGCGCGGGCTGCG 100
AAALDDAPDVPLLRGLR
GCGTACGACCGTCCGGCGTGCCGCCGTCCGGGAACGCTCTCTCGCCGACC 150
RTTVRRAAVRERSLAD
GCTCGCCGTGCTGCCCGACGACGAGCGCGCCGACGCCTCCCTCGCGTTCG 200
RSPCCPTTSAPTPPSRS
TCCTGGAACAGCACCGCCACCGTGCTCGGCCACCTGGGCGCCGAAGACAT 250
SWNSTATVLGHLGAEDI
CCCGGCGACGACGACGTTCAAGGAACTCGGCATCGACTCGCTCACCGCGG 300
PATTTFKELGIDSLTA
TCCAGCTGCGCAACGCGCTGACCACGGCGACCGGCGTACGCCTCAACGCC 350
VQLRNALTTATGVRLNA
ACAGCGGTCTTCGACTTTCCGACGCCGCGCGCGCTCGCCGCGAGACTCGG 400
TAVFDFPTPRALAARLG
CGACGAGCTGGCCGGTACCCGCGCGCCCGTCGCGGCCCGGACCGCGGCCA 450
DELAGTRAPVAARTAA
CCGCGGCCGCGCACGACGAACCGCTGGCGATCGTGGGCATGGCCTGCCGT 500
TAAAHDEPLAIVGMACR
CTGCCGGGCGGGGTCGCGTCGCCACAGGAGCTGTGGCGTCTCGTCGCGTC 550 

LPGGVAS PQELWRLVAS 
CGGCACCGACGCCATCACGGAGTTCCCCGCGGACCGCGGCTGGGACGTGG 600 
5 GT DAITEFPADRGWDV 

ACGCGCTCTACGACCCGGACCCCGACGCGATCGGCAAGACCTTCGTCCGG 650 
D ALY D P D P DA I GKT FVR 
CACGGCGGCTTCCTCGACGGTGCGACCGGCTTCGACGCGGCGTTCTTCGG 700 
HGGFLDGATGFDAAFFG 
1 0 GATCAGCCCGCGCGAGGCCCTGGCCATGGACCCGCAGCAACGGGTGCTCC 750 
ISPREALAMDPQQRVL 
TGGAGACGTCCTGGGAGGCGTTCGAAAGCGCGGGCATCACCCCGGACGCG 800 
LETSWEAFESAGI. TPDA 
GCGCGGGGCAGCGACACCGGCGTGTTCATCGGCGCGTTCTCCTACGGGTA 850 
15 ARGSDTGVFIGAFSYGY 

CGGCACGGGTGCGGATACCAACGGCTTCGGCGCGACAGGGTCGCAGACCA 900 

GTGADTNGFGATGSQT 
GCGTGCTCTCCGGCCGCCTCTCGTACTTCTACGGTCTGGAGGGCCCTTCG 950 
SVLSGRLSYFYGLEGPS
GTCACGGTCGACACCGCCTGCTCGTCGTCACTGGTCGCCCTGCACCAGGC 1000
VTVDTACSSSLVALHQA
AGGGCAGTCCCTGCGCTCGGGCGAATGCTCGCTCGCCCTGGTCGGCGGTG 1050
GQSLRSGECSLALVGG
TCACGGTGATGGCGTCGCCCGGCGGATTCGTCGAGTTCTCCCGGCAGCGC 1100
VTVMASPGGFVEFSRQR
GGGCTCGCGCCGGACGGGCGGGCGAAGGCGTTCGGCGCGGGCGCGGACGG 1150
GLAPDGRAKAFGAGADG
TACGAGCTTCGCCGAGGGCGCCGGTGCCCTGGTGGTCGAGCGGCTCTCCG 1200
TSFAEGAGALVVERLS
ACGCGGAGCGCCACGGCCACACCGTCCTCGCCCTCGTACGCGGCTCCGCG 1250
DAERHGHTVLALVRGSA
GCTAACTCCGACGGCGCGTCGAACGGTCTGTCGGCGCCGAACGGCCCCTC 1300
ANSDGASNGLSAPNGPS
CCAGGAACGCGTCATCCACCAGGCCCTCGCGAACGCGAAACTCACCCCCG 1350
QERVIHQALANAKLTP

CCGATGTCGACGCGGTCGAGGCGCACGGCACCGGCACCCGCCTCGGCGAC 1400 
ADVDAVEAHGTGTRLGD 
CCCATCGAGGCGCAGGCGCTGCTCGCGACGTACGGACAGGACCGGGCGAC 14 50 
PI EAQALLATYGQDRAT 
40 GCCCCTGCTGCTCGGCTCGCTGAAGTCGAACATCGGGCACGCCCAGGCCG 1500 
PLLLGSLKSNIGHAQA 
CGTCAGGGGTCGCCGGGATCATCAAGATGGTGCAGGCCATCCGGCACGGG 1550 
ASGVAGI I KMVQAI RHG 
GAACTGCCGCCGACACTGCACGCGGACGAGCCGTCGCCGCACGTCGACTG 1600 
45 ELPPTLHADEPSPHVDW 

GACGGCCGGTGCCGTCGAGCTCCTGACGTCGGCCCGGCCGTGGCCGGGGA 1650 

TAGAVE L L T S AR PW P G 
CCGGTCGCCCGCGCCGCGCTGCCGTCTCGTCGTTCGGCGTGAGCGGCACG 1700 
TGRPRRAAVSSFGVSGT 
50 AACGCCCACATCATCCTTGAGGCAGGACCGGTCAAAACGGGACCGGTCGA 1750 
NAHI ILEAGPVKTGPVE 
GGCAGGAGCGATCGAGGCAGGACCGGTCGAAGTAGGACCGGTCGAGGCTG 1800 

AGAIEAGPVEVGPVEA 
GACCGCTCCCCGCGGCGCCGCCGTCAGCACCGGGCGAAGACCTTCCGCTG 1850
GPLPAAPPSAPGEDLPL
CTCGTGTCGGCGCGTTCCCCGGAGGCACTCGACGAGCAGATCGGGCGCCT 1900 

LV-SARSPEALDEQIGRL 
GCGCGCCTATCTCGACACCGGCCCGGGCGTCGACCGGGCGGCCGTGGCGC 1950 
5 RAYLDTGPGVDRAAVA 

AGACACTGGCCCGGCGTACGCACTTCACCCACCGGGCCGTACTGCTCGGG 2000 
QTLARRTHFTHRAVLLG 
GACACCGTCATCGGCGCTCCCCCCGCGGACCAGGCCGACGAACTCGTCTT 2050 
D T V I GA P PA DQADE LV F 
10 CGTCTACTCCGGTCAGGGCACCCAGCATCCCGCGATGGGCGAGCAGCTAG 2100 
VYSGQGTQHPAMGEQL 
CCGCCGCGTTCCCCGTCTTCGCGCGGATCCATCAGCAGGTGTGGGACCTG 2150 
AAAFPVFARI HQQVWDL 
CTCGATGTGCCCGATCTGGAGGTGAACGAGACCGGTTACGCCCAGCCGGC 2200 
15 LDVPDLEVNETGYAQPA 

CCTGTTCGCAATGCAGGTGGCTCTGTTCGGGCTGCTGGAATCGTGGGGTG 2250 

LFAM. Q'VALFGLLESWG 
TACGACCGGACGCGGTGATCGGCCATTCGGTGGGTGAGCTTGCGGCTGCG 2300 
VRPDAVIGHSVGELAAA
TATGTGTCCGGGGTGTGGTCGTTGGAGGATGCCTGCACTTTGGTGTCGGC 2350
YVSGVWSLEDACTLVSA
GCGGGCTCGTCTGATGCAGGCTCTGCCCGCGGGTGGGGTGATGGTCGCTG 2400
RARLMQALPAGGVMVA
TCCCGGTCTCGGAGGATGAGGCCCGGGCCGTGCTGGGTGAGGGTGTGGAG 2450
VPVSEDEARAVLGEGVE
ATCGCCGCGGTCAACGGCCCGTCGTCGGTGGTTCTCTCCGGTGATGAGGC 2500
IAAVNGPSSVVLSGDEA
CGCCGTGCTGCAGGCCGCGGAGGGGCTGGGGAAGTGGACGCGGCTGGCGA 2550 
AVLQAAEGLGKWTRLA 
30 CCAGCCACGCGTTCCATTCCGCCCGTATGGAACCCATGCTGGAGGAGTTC 2 600 
TSHAFHSARMEPMLEEF 
CGGGCGGTCGCCGAAGGCCTGACCTACCGGACGCCGCAGGTCTCCATGGC 2 650 

RAVAEGLTYRTPQVSMA 
CGTTGGTGATCAGGTGACCACCGCTGAGTACTGGGTGCGGCAGGTCCGGG 2700 
35 VGDQVTTAEYWVRQVR 

ACACGGTCCGGTTCGGCGAGCAGGTGGCCTCGTACGAGGACGCCGTGTTC 2750 
DTVRFGEQVASYE DAVF 
GTCGAGCTGGGTGCCGACCGGTCACTGGCCCGCCTGGTCGACGGTGTCGC 2800 
VELGADRSLARLVDGVA 
40 GATGCTGCACGGCGACCACGAAATCCAGGCCGCGATCGGCGCCCTGGCCC 2850 
MLHGDHEI.QAAIGALA 
ACCTGTATGTCAACGGCGTCACGGTCGACTGGCCCGCGCTCCTGGGCGAT 2 900 
HLYVNGVTVDWPALLGD 
GCTCCGGCAACACGGGTGCTGGACCTTCCGACATACGCCTTCCAGCACCA 2950 
45 APATRVLDLPTYAFQHQ 

GCGCTACTGGCTCGAGTCGGCTCCCCCGGCCACGGCCGACTCGGGCCACC 3000 

RYWLES APPATADSGH 
CCGTCCTCGGCACCGGAGTCGCCGTCGCCGGGTCGCCGGGCCGGGTGTTC 3050 
PVLGTGVAVAGSPGRVF 
50 ACGGGTCCCGTGCCCGCCGGTGCGGACCGCGCGGTGTTCATCGCCGAACT 3100 
TGPVPAGADRAVFIAEL 
GGCGCTCGCCGCCGCCGACGCCACCGACTGCGCCACGGTCGAACAGCTCG 3150 

ALAAADAT DCATVEQL 
ACGTCACCTCCGTGCCCGGCGGATCCGCCCGCGGCAGGGCCACCGCGCAG 3200
DVTSVPGGSARGRATAQ
ACCTGGGTCGATGAACCCGCCGCCGACGGGCGGCGCCGCTTCACCGTCCA 3250 

TWVDEPAADGRRRFTVH 
CACCCGCGTCGGCGACGCCCCGTGGACGCTGCACGCCGAGGGGGTTCTCC 3300 
5 TRVGD APWTLHAEGVL 

GCCCCGGCCGCGTGCCCCAGCCCGAAGCCGTCGACACCGCCTGGCCCCCG 3350 
RPGRVPQPEAVDTAWPP 
CCGGGCGCGGTGCCCGCGGACGGGCTGCCCGGGGCGTGGCGACGCGCGGA 34 00 
PGAVPADGL PGAWRRAD 
1 0 CCAGGTCTTCGTCGAAGCCGAAGTCGACAGCCCTGACGGCTTCGTGGCAC 34 50 
QVFVEAEVDS PDGFVA 
ACCCCGACCTGCTCGACGCGGTCTTCTCCGCGGTCGGCGACGGGAGCCGC 3500 
HPDLLDAVFSAVGDGSR 
CAGCCGACCGGATGGCGCGACCTCGCGGTGCACGCGTCGGACGCCACCGT 3550 
15 QPTGWRD LAVHASDATV 

GCTGCGCGCCTGCCTCACCCGCCGCGACAGTGGTGTCGTGGAGCTCGCCG 3600 
LRACLTRRDSGVVELA 
CCTTCGACGGTGCCGGAATGCCGGTGCTCACCGCGGAGTCGGTGACGCTG 3650
AFDGAGMPVLTAESVTL
GGCGAGGTCGCGTCGGCAGGCGGATCCGACGAGTCGGACGGTCTGCTTCG 3700
GEVASAGGSDESDGLLR
GCTTGAGTGGTTGCCGGTGGCGGAGGCCCACTACGACGGTGCCGACGAGC 3750
LEWLPVAEAHYDGADE
TGCCCGAGGGCTACACCCTCATCACCGCCACACACCCCGACGACCCCGAC 3800
LPEGYTLITATHPDDPD
GACCCCACCAACCCCCACAACACACCCACACGCACCCACACACAAACCAC 3850
DPTNPHNTPTRTHTQTT
ACGCGTCCTCACCGCCCTCCAACACCACCTCATCACCACCAACCACACCC 3900
RVLTALQHHLITTNHT
TCATCGTCCACACCACCACCGACCCCCCAGGCGCCGCCGTCACCGGCCTC 3950
LIVHTTTDPPGAAVTGL
ACCCGCACCGCACAAAACGAACACCCCGGCCGCATCCACCTCATCGAAAC 4000
TRTAQNEHPGRIHLIET
CCACCACCCCCACACCCCACTCCCCCTCACCCAACTCACCACCCTCCACC 4050
HHPHTPLPLTQLTTLH

AACCCCACCTACGCCTCACCAACAACACCCTCCACACCCCCCACCTCACC 4100 
QPHLRLTNNTLHTPHLT 
CCCATCACCACCCACCACAACACCACCACAACCACCCCCAACACCCCACC 4150 
PITTHHNTTTTTPNTPP 
40 CCTCAACCCCAACCACGCCATCCTCATCACCGGCGGCTCCGGCACCCTCG 4 200 
LNPNHAILITGGSGTL 
CCGGCATCCTCGCCCGCCACCTCAACCACCCCCACACCTACCTCCTCTCC 4250 
AGILARHLNHPHTYLLS 
CGCACACCACCACCCCCCACCACACCCGGCACCCACATCCCCTGCGACCT 4 300 
45 RTPPP.PTTPGTHIPCDL 

CACCGACCCCACCCAAATCACCCAAGCCCTCACCCACATACCACAACCCC 4350 

TDPTQITQALTHIPQP 
TCACCGGCATCTTCCACACCGCCGCCACCCTCGACGACGCCACCCTCACC 4 4 00 
LTGI FHTAATLDDATLT 

50 

AACCTCACCCCCCAACACCTCACCACCACCCTCCAACCCAAAGCCGACGC 4 4 50 

NLTPQHLTTTLQPKADA 
CGCCTGGCACCTCCACCACCACACCCAAAACCAACCCCTCACCCACTTCG 4 500 

AWHLHHHTQNQPLTHF 
TCCTCTACTCCAGCGCCGCCGCCACCCTCGGCAGCCCCGGCCAAGCCAAC 4550
VLYSSAAATLGSPGQAN
TACGCCGCCGCCAACGCCTTCCTCGACGCCCTCGCCACCCACCGCCACAC 4 600 

YAAANAFLDALATHRHT 
CCAAGGACAACCCGCCACCACCATCGCCTGGGGCATGTGGCACACCACCA 4 650 

QGQPATT IAWGMWHTT 
CCACACTCACCAGCCAACTCACCGACAGCGACCGCGACCGCATCCGCCGC 4 700 
TTLT^SQLTDS DRDRIRR 
GGCGGCTTCCTGCCGATCTCGGACGACGAGGGCATGC 
GGFLPISDDEGM 

The NheI-XhoI hybrid FK-506 PKS module 8 containing the AT domain of
module 13 of rapamycin is shown below.

GCATGCGGCTGTACGAGGCGGCACGGCGCACCGGAAGTCCCGTGGTGGTG 5 0 
MRLYEAARRTGS PVVV 
15 GCGGCCGCGCTCGACGACGCGCCGGACGTGCCGCTGCTGCGCGGGCTGCG 100 
AAAL DDAP DVPLL RGLR 
GCGTACGACCGTCCGGCGTGCCGCCGTCCGGGAACGCTCTCTCGCCGACC 150 

RTTVRRAAVRERSLAD 
GCTCGCCGTGCTGCCCGACGACGAGCGCGCCGACGCCTCCCTCGCGTTCG 200 
RSPCCPTTSAPTPPSRS
TCCTGGAACAGCACCGCCACCGTGCTCGGCCACCTGGGCGCCGAAGACAT 250
SWNSTATVLGHLGAEDI
CCCGGCGACGACGACGTTCAAGGAACTCGGCATCGACTCGCTCACCGCGG 300
PATTTFKELGIDSLTA
TCCAGCTGCGCAACGCGCTGACCACGGCGACCGGCGTACGCCTCAACGCC 350
VQLRNALTTATGVRLNA
ACAGCGGTCTTCGACTTTCCGACGCCGCGCGCGCTCGCCGCGAGACTCGG 400
TAVFDFPTPRALAARLG
CGACGAGCTGGCCGGTACCCGCGCGCCCGTCGCGGCCCGGACCGCGGCCA 450
DELAGTRAPVAARTAA
CCGCGGCCGCGCACGACGAACCGCTGGCGATCGTGGGCATGGCCTGCCGT 500
TAAAHDEPLAIVGMACR
CTGCCGGGCGGGGTCGCGTCGCCACAGGAGCTGTGGCGTCTCGTCGCGTC 550

LPGGVASPQELWRLVAS 
35 CGGCACCGACGCCATCACGGAGTTCCCCGCGGACCGCGGCTGGGACGTGG 600 
GT DAI TE FP ADRGWDV 
ACGCGCTCTACGACCCGGACCCCGACGCGATCGGCAAGACCTTCGTCCGG 650 
DALYDPDPDAIGKTFVR 
CACGGCGGCTTCCTCGACGGTGCGACCGGCTTCGACGCGGCGTTCTTCGG 7 00 
40 HGGFLDGATGFDAAFFG 

GATCAGCCCGCGCGAGGCCCTGGCCATGGACCCGCAGCAACGGGTGCTCC 750 

I S PR EALAMDPQQRVL 
TGGAGACGTCCTGGGAGGCGTTCGAAAGCGCGGGCATCACCCCGGACGCG 800 
LETS WEAFESAGI TPDA 
45 GCGCGGGGCAGCGACACCGGCGTGTTCATCGGCGCGTTCTCCTACGGGTA 850 
ARGSDTGVFIGAFSYGY 
CGGCACGGGTGCGGATACCAACGGCTTCGGCGCGACAGGGTCGCAGACCA 900 

GTGADTNGFGATGSQT 
GCGTGCTCTCCGGCCGCCTCTCGTACTTCTACGGTCTGGAGGGCCCTTCG 950
50 SVLSGRLSYFYGLEGPS 

GTCACGGTCGACACCGCCTGCTCGTCGTCACTGGTCGCCCTGCACCAGGC 1000 
VTVDTACSSSLVALHQA
AGGGCAGTCCCTGCGCTCGGGCGAATGCTCGCTCGCCCTGGTCGGCGGTG 1050

GQSLRSGECSLALVGG 
TCACGGTGATGGCGTCGCCCGGCGGATTCGTCGAGTTCTCCCGGCAGCGC 1100
VTVMASPGGFVEFSRQR
5 GGGCTCGCGCCGGACGGGCGGGCGAAGGCGTTCGGCGCGGGCGCGGACGG 1150 
GLAPDGRAKAFGAGADG 
TACGAGCTTCGCCGAGGGCGCCGGTGCCCTGGTGGTCGAGCGGCTCTCCG 1200 

TSFAEGAGALVVERLS 
ACGCGGAGCGCCACGGCCACACCGTCCTCGCCCTCGTACGCGGCTCCGCG 1250 
10 DAERHGHTVLALVRGSA 

GCTAACTCCGACGGCGCGTCGAACGGTCTGTCGGCGCCGAACGGCCCCTC 1300 

ANSDGASNGLSAPNGPS 
CCAGGAACGCGTCATCCACCAGGCCCTCGCGAACGCGAAACTCACCCCCG 1350 
QERVI HQALANAKLT P 
15 CCGATGTCGACGCGGTCGAGGCGCACGGCACCGGCACCCGCCTCGGCGAC 14 00 
ADVDAVEAHGTGT RLGD 
CCCATCGAGGCGCAGGCGCTGCTCGCGACGTACGGACAGGACCGGGCGAC 14 50 
PIEAQALLATYGQDR .AT 
GCCCCTGCTGCTCGGCTCGCTGAAGTCGAACATCGGGCACGCCCAGGCCG 1500
PLLLGSLKSNIGHAQA
CGTCAGGGGTCGCCGGGATCATCAAGATGGTGCAGGCCATCCGGCACGGG 1550
ASGVAGIIKMVQAIRHG
GAACTGCCGCCGACACTGCACGCGGACGAGCCGTCGCCGCACGTCGACTG 1600
ELPPTLHADEPSPHVDW
GACGGCCGGTGCCGTCGAGCTCCTGACGTCGGCCCGGCCGTGGCCGGGGA 1650
TAGAVELLTSARPWPG
CCGGTCGCCCGCGCCGCGCTGCCGTCTCGTCGTTCGGCGTGAGCGGCACG 1700
TGRPRRAAVSSFGVSGT
AACGCCCACATCATCCTTGAGGCAGGACCGGTCAAAACGGGACCGGTCGA 1750
NAHIILEAGPVKTGPVE

GGCAGGAGCGATCGAGGCAGGACCGGTCGAAGTAGGACCGGTCGAGGCTG 1800 

AGAIEAG PVEVGPVEA 
GACCGCTCCCCGCGGCGCCGCCGTCAGCACCGGGCGAAGACCTTCCGCTG 1850
GPLPAAPPSAPGEDLPL 
35 CTCGTGTCGGCGCGTTCCCCGGAGGCACTCGACGAGCAGATCGGGCGCCT 1900 
LVSARSPEALDEQIGRL 
GCGCGCCTATCTCGACACCGGCCCGGGCGTCGACCGGGCGGCCGTGGCGC 1950 

RAYLDTG PGVDRAAVA 
AGACACTGGCCCGGCGTACGCACTTCACCCACCGGGCCGTACTGCTCGGG 2000 
40 QTLARRTH FTHRAVLLG 

GACACCGTCATCGGCGCTCCCCCCGCGGACCAGGCCGACGAACTCGTCTT 2050 

DTVIGAP PADQADELVF 
CGTCTACTCCGGTCAGGGCACCCAGCATCCCGCGATGGGCGAGCAGCTAG 2100 
VY'SGQGTQHPAMGEQL 
45 CCGATTCGTCGGTGGTGTTCGCCGAGCGGATGGCCGAGTGTGCGGCGGCG 2150 
A D S S VV FAE RMAE C A A • A 
TTGCGCGAGTTCGTGGACTGGGATCTGTTCACGGTTCTGGATGATCCGGC 2200 

LREFV DWDL FTV LDDPA 
GGTGGTGGACCGGGTTGATGTGGTCCAGCCCGCTTCCTGGGCGATGATGG 2250 
50 VVDRVDVVQPASWAMM 

TTTCCCTGGCCGCGGTGTGGCAGGCGGCCGGTGTGCGGCCGGATGCGGTG 2300 
VSLAAVWQAAGVRPDAV 
ATCGGCCATTCGCAGGGTGAGATCGCCGCAGCTTGTGTGGCGGGTGCGGT 2350 
IGHSQGEIAAACVAGAV
GTCACTACGCGATGCCGCCCGGATCGTGACCTTGCGCAGCCAGGCGATCG 2400

SLRDAARIVTLRSQAI 
CCCGGGGCCTGGCGGGCCGGGGCGCGATGGCATCCGTCGCCCTGCCCGCG 24 50 
ARG LAG RGAMAS V A L PA 
5 CAGGATGTCGAGCTGGTCGACGGGGCCTGGATCGCCGCCCACAACGGGCC 2500 
QDVELV'DGAWIAAHNGP 
CGCCTCCACCGTGATCGCGGGCACCCCGGAAGCGGTCGACCATGTCCTCA 2550 

ASTVIAGTPEAVDHVL 
CCGCTCATGAGGCACAAGGGGTGCGGGTGCGGCGGATCACCGTCGACTAT 2600 
10 TAHEAQGVRVRRITVDY 

GCCTCGCACACCCCGCACGTCGAGCTGATCCGCGACGAACTACTCGACAT 2650 

ASHT PHVELIRDELLDI 
CACTAGCGACAGCAGCTCGCAGACCCCGCTCGTGCCGTGGCTGTCGACCG 27 00 
TSDSSSQTPLVPWLST 
15 TGGACGGCACCTGGGTCGACAGCCCGCTGGACGGGGAGTACTGGTACCGG 27 50 
VDGTWVDS PLDGEYWYR 
AACCTGCGTGAACCGGTCGGTTTCCACCCCGCCGTCAGCCAGTTGCAGGC 28 00 
NLREPVGFHPAVSQLQA 
CCAGGGCGACACCGTGTTCGTCGAGGTCAGCGCCAGCCCGGTGTTGTTGC 2850
QGDTVFVEVSASPVLL
AGGCGATGGACGACGATGTCGTCACGGTTGCCACGCTGCGTCGTGACGAC 2900
QAMDDDVVTVATLRRDD
GGCGACGCCACCCGGATGCTCACCGCCCTGGCACAGGCCTATGTCCACGG 2950
GDATRMLTALAQAYVHG
CGTCACCGTCGACTGGCCCGCCATCCTCGGCACCACCACAACCCGGGTAC 3000
VTVDWPAILGTTTTRV
TGGACCTTCCGACCTACGCCTTCCAACACCAGCGGTACTGGCTCGAGTCG 3050
LDLPTYAFQHQRYWLES
GCTCCCCCGGCCACGGCCGACTCGGGCCACCCCGTCCTCGGCACCGGAGT 3100
APPATADSGHPVLGTGV
CGCCGTCGCCGGGTCGCCGGGCCGGGTGTTCACGGGTCCCGTGCCCGCCG 3150
AVAGSPGRVFTGPVPA
GTGCGGACCGCGCGGTGTTCATCGCCGAACTGGCGCTCGCCGCCGCCGAC 3200
GADRAVFIAELALAAAD



35 GCCACCGACTGCGCCACGGTCGAACAGCTCGACGTCACCTCCGTGCCCGG 3250 
ATDCATVEQLDVTSVPG 
CGGATCCGCCCGCGGCAGGGCCACCGCGCAGACCTGGGTCGATGAACCCG 3300 

GSARGRATAQTWVDEP 
CGGCCGACGGGCGGCGCCGCTTCACCGTCCACACCCGCGTCGGCGACGCC 3350 
40 AADGRRRFTVHTRVGDA, 

CCGTGGACGCTGCACGCCGAGGGGGTTCTCCGCCCCGGCCGCGTGCCCCA 34 00 

PWTL HAEGVLRPGRVPQ 
GCCCGAAGCCGTCGACACCGCCTGGCCCCCGCCGGGCGCGGTGCCCGCGG 34 50 
PEAVDTAWPPPGAVPA 
45 ACGGGCTGCCCGGGGCGTGGCGACGCGCGGACCAGGTCTTCGTCGAAGCC 3500 
DGLPGAWRRADQVFVEA 
GAAGTCGACAGCCCTGACGGCTTCGTGGCACACCCCGACCTGCTCGACGC 3550 

EVDS PDGFVAHPDLLDA 
GGTCTTCTCCGCGGTCGGCGACGGGAGCCGCCAGCCGACCGGATGGCGCG 3600 
50 VFSAVGD GSRQPTGWR 

ACCTCGCGGTGCACGCGTCGGACGCCACCGTGCTGCGCGCCTGCCTCACC 3650 
DLAVHASDATVLRACLT 
CGCCGCGACAGTGGTGTCGTGGAGCTCGCCGCCTTCGACGGTGCCGGAAT 3700 
RRDSGVVELAAFDGAGM 









GCCGGTGCTCACCGCGGAGTCGGTGACGCTGGGCGAGGTCGCGTCGGCAG 37 50 

PVLTAE SVTLGEVASA 
GCGGATCCGACGAGTCGGACGGTCTGCTTCGGCTTGAGTGGTTGCCGGTG 3800 
GGSDESDGLLRLEWLPV 
GCGGAGGCCCACTACGACGGTGCCGACGAGCTGCCCGAGGGCTACACCCT 3850 

AEAHYDGADELPEGYTL 
CATCACCGCCACACACCCCGACGACCCCGACGACCCCACCAACCCCCACA 3900 

ITATHPDDPDDPTNPH 
ACACACCCACACGCACCCACACACAAACCACACGCGTCCTCACCGCCCTC 3950
NTPTRTHTQTTRVLTAL 
CAACACCACCTCATCACCACCAACCACACCCTCATCGTCCACACCACCAC 4000 

QHHLITTNHTLIVHTTT 
CGACCCCCCAGGCGCCGCCGTCACCGGCCTCACCCGCACCGCACAAAACG 4050 

DPPGAAVTGLTRTAQN 
AACACCCCGGCCGCATCCACCTCATCGAAACCCACCACCCCCACACCCCA 4100 
EHPGRIHLIETHHPHT P 
CTCCCCCTCACCCAACTCACCACCCTCCACCAACCCCACCTACGCCTCAC 4150 

LPLTQLTTLHQPHLRLT 
CAACAACACCCTCCACACCCCCCACCTCACCCCCATCACCACCCACCACA 4200 

NNTLHTPHLTPITTHH 
ACACCACCACAACCACCCCCAACACCCCACCCCTCAACCCCAACCACGCC 4250
NTTTTTPNTPPLNPNHA
ATCCTCATCACCGGCGGCTCCGGCACCCTCGCCGGCATCCTCGCCCGCCA 4300

ILITGGSGTLAGILARH
CCTCAACCACCCCCACACCTACCTCCTCTCCCGCACACCACCACCCCCCA 4350

LNHPHTYLLSRTPPPP
CCACACCCGGCACCCACATCCCCTGCGACCTCACCGACCCCACCCAAATC 4 400 
TTPGTHI PCDLTDPTQI 
ACCCAAGCCCTCACCCACATACCACAACCCCTCACCGGCATCTTCCACAC 4 450 

TQALTHIPQPLTGIFHT 
CGCCGCCACCCTCGACGACGCCACCCTCACCAACCTCACCCCCCAACACC 4 500 

AATLDDATLTNLTPQH 
TCACCACCACCCTCCAACCCAAAGCCGACGCCGCCTGGCACCTCCACCAC 4550
LTTTLQPKADAAWHLHH 
CACACCCAAAACCAACCCCTCACCCACTTCGTCCTCTACTCCAGCGCCGC 4 600 

HTQNQPLTHFVLYSSAA 
CGCCACCCTCGGCAGCCCCGGCCAAGCCAACTACGCCGCCGCCAACGCCT 4 650 

A T L G S PGQANY AAANA 
TCCTCGACGCCCTCGCCACCCACCGCCACACCCAAGGACAACCCGCCACC 4700 
FLDALATHRHTQGQPAT 
ACCATCGCCTGGGGCATGTGGCACACCACCACCACACTCACCAGCCAACT 4 7 50 

TIAWGMWHTTTTLTSQL 
CACCGACAGCGACCGCGACCGCATCCGCCGCGGCGGCTTCCTGCCGATCT 4800 

TDSDRDRIRRGGFLPI 
CGGACGACGAGGGCATGC 
S D D E G M 
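
The hybrid-module listings above can be scanned for the engineered restriction sites once the DNA rows are stripped of margin numbers and position labels. The following is a minimal sketch, not part of the original disclosure: the helper names `clean_row` and `find_sites` are hypothetical, the recognition sequences are the standard ones for AvrII, NheI, and XhoI, and the two rows are copied from the first NheI-XhoI listing. Only DNA rows should be passed in, because the translation rows also contain the letters A, C, G, and T.

```python
import re

# Standard recognition sequences of the enzymes named in the listings.
SITES = {"AvrII": "CCTAGG", "NheI": "GCTAGC", "XhoI": "CTCGAG"}


def clean_row(row):
    """Drop margin numbers, position labels, and spaces from one DNA row."""
    return re.sub(r"[^ACGT]", "", row.upper())


def find_sites(rows):
    """Join DNA rows and return {enzyme: [1-based start positions]}."""
    seq = "".join(clean_row(r) for r in rows)
    return {name: [m.start() + 1 for m in re.finditer(site, seq)]
            for name, site in SITES.items()}


# Two consecutive DNA rows from the NheI-XhoI listing (translation rows omitted).
rows = [
    "10 CGTCTACTCCGGTCAGGGCACCCAGCATCCCGCGATGGGCGAGCAGCTAG 2100",
    "CCGCCGCGTTCCCCGTCTTCGCGCGGATCCATCAGCAGGTGTGGGACCTG 2150",
]
hits = find_sites(rows)
print(hits)
```

Joining the rows before searching matters here because the NheI site straddles the break between the two rows; positions are counted from the first base of the excerpt, not from the start of the module.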



Recombinant PKS Genes for 13-desmethoxy FK-506 and FK-520 
The present invention provides a variety of recombinant PKS genes in addition to 
those described in Examples 1 and 2 for producing 13-desmethoxy FK-506 and FK-520
compounds. This Example provides the construction protocols for recombinant FK-520
and FK-506 (from Streptomyces sp. MA6858 (ATCC 55098), described in U.S. Patent
No. 5,116,756, incorporated herein by reference) PKS genes in which the module 8 AT
coding sequences have been replaced by either the rapAT3 (the AT domain from module
3 of the rapamycin PKS), rapAT12, eryAT1 (the AT domain from module 1 of the
erythromycin (DEBS) PKS), or eryAT2 coding sequences. Each of these constructs
provides a PKS that produces the 13-desmethoxy-13-methyl derivative, except for the
rapAT12 replacement, which provides the 13-desmethoxy derivative, i.e., it has a
hydrogen where the other derivatives have methyl.

Figure 7 shows the process used to generate the AT replacement constructs. First,
a fragment of ~4.5 kb containing module 8 coding sequences from the FK-520 cluster of
ATCC 14891 was cloned using the convenient restriction sites SacI and SphI (Step A in
Figure 7). The choice of restriction sites used to clone a 4.0 - 4.5 kb fragment comprising
module 8 coding sequences from other FK-520 or FK-506 clusters can be different
depending on the DNA sequence, but the overall scheme is identical. The unique SacI
and SphI restriction sites at the ends of the FK-520 module 8 fragment were then changed
to unique BglII and NsiI sites by ligation to synthetic linkers (described in the preceding
Examples, see Step B of Figure 7). Fragments containing sequences 5' and 3' of the AT8
sequences were then amplified using primers, described above, that introduced either an
AvrII site or an NheI site at two different KS/AT boundaries and an XhoI site at the
AT/DH boundary (Step C of Figure 7). Heterologous AT domains from the rapamycin
and erythromycin gene clusters were amplified using primers, as described above, that
introduced the same sites as just described (Step D of Figure 7). The fragments were
ligated to give hybrid modules with in-frame fusions at the KS/AT and AT/DH
boundaries (Step E of Figure 7). Finally, these hybrid modules were ligated into the
BamHI and PstI sites of the KC515 vector. The resulting recombinant phage were used to
transform the FK-506 and FK-520 producer strains to yield the desired recombinant cells,
as described in the preceding Examples.
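
The domain exchange of Steps C through E can be sketched in code as a cassette swap between two engineered sites. This is an illustration only: the function names and the short host and donor sequences are hypothetical toy values, not sequences from this disclosure, and the frame check assumes that the left-hand site sits in the same codon phase in both host and donor, as the engineered boundaries are intended to ensure.

```python
AVRII = "CCTAGG"  # AvrII recognition sequence (KS/AT boundary)
XHOI = "CTCGAG"   # XhoI recognition sequence (AT/DH boundary)


def split_at_sites(seq, left=AVRII, right=XHOI):
    """Return (upstream, cassette, downstream); the cassette keeps both sites."""
    seq = seq.upper()
    if seq.count(left) != 1 or seq.count(right) != 1:
        raise ValueError("each site must occur exactly once")
    start = seq.index(left)
    end = seq.index(right) + len(right)
    if start + len(left) > end - len(right):
        raise ValueError("left site must precede right site")
    return seq[:start], seq[start:end], seq[end:]


def swap_domain(host, donor, left=AVRII, right=XHOI):
    """Put the donor cassette into the host and report whether the frame is kept."""
    upstream, old, downstream = split_at_sites(host, left, right)
    _, new, _ = split_at_sites(donor, left, right)
    # Downstream codons stay in phase only if the length changes by a multiple of 3.
    in_frame = (len(new) - len(old)) % 3 == 0
    return upstream + new + downstream, in_frame


# Hypothetical toy sequences standing in for module 8 and a heterologous AT.
host = "ATGGCC" + AVRII + "AAAGGGTTT" + XHOI + "TCGTAA"
donor = "GGG" + AVRII + "CATCATCATCAT" + XHOI + "CCC"
hybrid, ok = swap_domain(host, donor)
print(hybrid, ok)
```

Requiring each site to occur exactly once mirrors the reason unique sites are introduced in the scheme: a second copy of either site inside the fragment would make the exchange ambiguous.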
The following table shows the location and sequences surrounding the engineered
site of each of the heterologous AT domains employed. The FK-506 hybrid construct was
used as a control for the FK-520 recombinant cells produced, and a similar FK-520
hybrid construct was used as a control for the FK-506 recombinant cells.



Heterologous AT     Enzyme   Location of Engineered Site

FK-506 AT8          AvrII    GGCCGT ccgcgc CGTGCGGCGGTCTCGTCGTTC
(hydroxymalonyl)             GRPRRAAVSSF
                    NheI     ACCCAGCATCCCGCGATGGGTGAGCG gctcgc C
                             TQHPAMGERLA
                    XhoI     TACGCCTTCCAGCGGCGGCCCTACTGG atcgag
                             YAFQRRPYWIE

rapamycin AT3       AvrII    GACCGG ccccgt CGGGCGGGCGTGTCGTCCTTC
(methylmalonyl)              DRPRRAGVSSF
                    NheI     TGGCAGTGGCTGGGGATGGGCAGTGC cctgcg G
                             WQWLGMGSALR
                    XhoI     TACGCCTTCCAACACCAGCGGTACTGG gtcgag
                             YAFQHQRYWVE

rapamycin AT12      AvrII    GGCCGA gcgcgc CGGGCAGGCGTGTCGTCCTTC
(malonyl)                    GRARRAGVSSF
                    NheI     TCGCAGCGTGCTGGCATGGGTGAGGA actggc C
                             SQRAGMGEELA
                    XhoI     TACGCCTTCCAGCACCAGCGCTACTGG ctcgag
                             YAFQHQRYWLE

DEBS AT1            AvrII    GCGCGA ccgcgc CGGGCGGGGGTCTCGTCGTTC
(methylmalonyl)              ARPRRAGVSSF
                    NheI     TGGCAGTGGGCGGGCATGGCCGTCGA cctgct C
                             WQWAGMAVDLL
                    XhoI     TACCCGTTCCAGCGCGAGCGCGTCTGG ctcgaa
                             YPFQRERVWLE

DEBS AT2            AvrII    GACGGG gtgcgc CGGGCAGGTGTGTCGGCGTTC
(methylmalonyl)              DGVRRAGVSAF
                    NheI     GCCCAGTGGGAAGGCATGGCGCGGGA gttgtt G
                             AQWEGMARELL
                    XhoI     TATCCTTTCCAGGGCAAGCGGTTCTGG ctgctg
                             YPFQGKRFWLL
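
Each nucleotide sequence in the table is paired with the peptide it encodes, so the reading frame across an engineered site can be checked mechanically. The following Python sketch is illustrative only: the helper `translate` and the list name are hypothetical, and the codon table is the standard genetic code. It translates the three FK-506 AT8 entries in frame and compares the result with the printed peptides.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code: codons enumerated in TCAG order map onto AMINO.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string in frame 0, ignoring spaces and letter case."""
    seq = dna.replace(" ", "").upper()
    usable = len(seq) - len(seq) % 3  # drop a trailing partial codon
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# (annotated DNA, printed peptide) pairs copied from the FK-506 AT8 rows.
FK506_AT8 = [
    ("GGCCGT ccgcgc CGTGCGGCGGTCTCGTCGTTC", "GRPRRAAVSSF"),
    ("ACCCAGCATCCCGCGATGGGTGAGCG gctcgc C", "TQHPAMGERLA"),
    ("TACGCCTTCCAGCGGCGGCCCTACTGG atcgag", "YAFQRRPYWIE"),
]

for dna, peptide in FK506_AT8:
    status = "in frame" if translate(dna) == peptide else "MISMATCH"
    print(f"{peptide:12} {status}")
```

The same check can be applied to the other rows. A mismatch would point to a transcription error in either the nucleotide line or the peptide line.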
The sequences shown below provide the location of the KS/AT boundaries 
chosen in the FK-520 module 8 coding sequences. Regions where AvrII and NheI sites 
were engineered are indicated by lower case and underlining. 

CCGGCGCCGTCGAACTGCTGACGTCGGCCCGGCCGTGGCCCGAGACCGACCGG ccacgg C 
AGAVELLTSARPWPETDRPR 

GTGCCGCCGTCTCCTCGTTCGGGGTGAGCGGCACCAACGCCCACGTCATCCTGGAGGCCG 
RAAVSSFGVSGTNAHVILEA 

GACCGGTAACGGAGACGCCCGCGGCATCGCCTTCCGGTGACCTTCCCCTGCTGGTGTCGG 
GPVTETPAASPSGDLPLLVS 

CACGCTCACCGGAAGCGCTCGACGAGCAGATCCGCCGACTGCGCGCCTACCTGGACACCA 
ARSPEALDEQIRRLRAYLDT 

CCCCGGACGTCGACCGGGTGGCCGTGGCACAGACGCTGGCCCGGCGCACACACTTCGCCC 
TPDVDRVAVAQTLARRTHFA 

ACCGCGCCGTGCTGCTCGGTGACACCGTCATCACCACACCCCCCGCGGACCGGCCCGACG 
HRAVLLGDTVITTPPADRPD 

AACTCGTCTTCGTCTACTCCGGCCAGGGCACCCAGCATCCCGCGATGGGCGAGCA gctcg 
ELVFVYSGQGTQHPAMGEQL 

c CGCCGCCCATCCCGTGTTCGCCGACGCCTGGCATGAAGCGCTCCGCCGCCTTGACAACC 
AAAHPVFADAWHEALRRLDN 

The sequences shown below provide the location of the AT/DH boundary chosen 
in the FK-520 module 8 coding sequences. The region where an XhoI site was 
engineered is indicated by lower case and underlining. 

TCCTCGGGGCTGGGTCACGGCACGACGCGGATGTGCCCGCGTACGCGTTCCAACGGCGGC 
ILGAGSRHDADVPAYAFQRR 

ACTACTGG atcgag TCGGCACGCCCGGCCGCATCCGACGCGGGCCACCCCGTGCTGGGCT 
HYWIESARPAASDAGHPVLG 

The sequences shown below provide the location of the KS/AT boundaries 
chosen in the FK-506 module 8 coding sequences. Regions where AvrII and NheI sites 
were engineered are indicated by lower case and underlining. 

TCGGCCAGGCCGTGGCCGCGGACCGGCCGT ccgcgc CGTGCGGCGGTCTCGTCGTTCGGG 
SARPWPRTGRPRRAAVSSFG 

GTGAGCGGCACCAACGCCCACATCATCCTGGAGGCCGGACCCGACCAGGAGGAGCCGTCG 
VSGTNAHIILEAGPDQEEPS 

GCAGAACCGGCCGGTGACCTCCCGCTGCTCGTGTCGGCACGGTCCCCGGAGGCACTGGAC 
AEPAGDLPLLVSARSPEALD 

GAGCAGATCGGGCGCCTGCGCGACTATCTCGACGCCGCCCCCGGCGTGGACCTGGCGGCC 
EQIGRLRDYLDAAPGVDLAA 

GTGGCGCGGACACTGGCCACGCGTACGCACTTCTCCCACCGCGCCGTACTGCTCGGTGAC 
VARTLATRTHFSHRAVLLGD 

ACCGTCATCACCGCTCCCCCCGTGGAACAGCCGGGCGAGCTCGTCTTCGTCTACTCGGGA 
TVITAPPVEQPGELVFVYSG 

CAGGGCACCCAGCATCCCGCGATGGGTGAGCG gctcgc CGCAGCCTTCCCCGTGTTCGCC 
QGTQHPAMGERLAAAFPVFA 

The sequences shown below provide the location of the AT/DH boundary chosen 
in the FK-506 module 8 coding sequences. The region where an XhoI site was 
engineered is indicated by lower case and underlining. 

GACCCGGACGTACCCGCCTACGCCTTCCAGCGGCGGCCCTACTGG atcgag TCCGCGCCG 
DPDVPAYAFQRRPYWIESAP 
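
The effect of introducing a restriction site on the encoded protein can be estimated from the boundary sequences above. The Python sketch below is a hypothetical illustration, not the procedure used in the Examples. It assumes that engineering a site means replacing the lower-case hexamer exactly with the enzyme's standard recognition sequence (CCTAGG for AvrII, GCTAGC for NheI, CTCGAG for XhoI), and it uses in-frame excerpts of the FK-506 module 8 sequences. The names `engineer`, `translate`, and `SITES` are illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(
        product("TCAG", repeat=3),
        "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG",
    )
}
SITES = {"AvrII": "CCTAGG", "NheI": "GCTAGC", "XhoI": "CTCGAG"}


def translate(seq):
    """Translate an in-frame DNA string, dropping any trailing partial codon."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def engineer(annotated, enzyme):
    """Swap the lower-case hexamer for the enzyme site.

    `annotated` is "UPSTREAM hexamer DOWNSTREAM" (downstream optional) and
    must start on a codon boundary. Returns (native, engineered) peptides.
    """
    left, native, right = (annotated.split() + [""])[:3]
    site = SITES[enzyme]
    if not native.islower() or len(native) != len(site):
        raise ValueError("expected a lower-case hexamer as the second field")
    return translate(left + native + right), translate(left + site + right)


EXCERPTS = [
    ("AvrII", "GGCCGT ccgcgc CGTGCGGCG"),
    ("NheI", "ATGGGTGAGCG gctcgc CGCAGCC"),
    ("XhoI", "CCCTACTGG atcgag TCCGCGCCG"),
]
for enzyme, annotated in EXCERPTS:
    before, after = engineer(annotated, enzyme)
    print(enzyme, before, after, "silent" if before == after else "changed")
```

Under the stated assumption, the AvrII and NheI substitutions in these excerpts leave the encoded residues unchanged, while the XhoI substitution changes the isoleucine codon to a leucine codon. The actual primers may differ from this assumption, so the result should be read as a plausibility check only.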

Example 4 

Replacement of Methoxyl with Hydrogen or Methyl at C-15 of FK-506 and FK-520 
The methods and reagents of the present invention also provide novel FK-506 and 
FK-520 derivatives in which the methoxy group at C-15 is replaced by a hydrogen or 
methyl. These derivatives are produced in recombinant host cells of the invention that 
express recombinant PKS enzymes that produce the derivatives. These recombinant PKS 
enzymes are prepared in accordance with the methodology of Examples 1 and 2, with the 
exception that the AT domain of module 7, instead of module 8, is replaced. Moreover, the 
present invention provides recombinant PKS enzymes in which the AT domains of both 
modules 7 and 8 have been changed. The table below summarizes the various compounds 
provided by the present invention. 



Compound   C-13       C-15       Derivative Provided

FK-506     hydrogen   hydrogen   13,15-didesmethoxy-FK-506
FK-506     hydrogen   methoxy    13-desmethoxy-FK-506
FK-506     hydrogen   methyl     13,15-didesmethoxy-15-methyl-FK-506
FK-506     methoxy    hydrogen   15-desmethoxy-FK-506
FK-506     methoxy    methoxy    Original Compound - FK-506
FK-506     methoxy    methyl     15-desmethoxy-15-methyl-FK-506
FK-506     methyl     hydrogen   13,15-didesmethoxy-13-methyl-FK-506
FK-506     methyl     methoxy    13-desmethoxy-13-methyl-FK-506
FK-506     methyl     methyl     13,15-didesmethoxy-13,15-dimethyl-FK-506
FK-520     hydrogen   hydrogen   13,15-didesmethoxy-FK-520
FK-520     hydrogen   methoxy    13-desmethoxy-FK-520
FK-520     hydrogen   methyl     13,15-didesmethoxy-15-methyl-FK-520
FK-520     methoxy    hydrogen   15-desmethoxy-FK-520
FK-520     methoxy    methoxy    Original Compound - FK-520
FK-520     methoxy    methyl     15-desmethoxy-15-methyl-FK-520
FK-520     methyl     hydrogen   13,15-didesmethoxy-13-methyl-FK-520
FK-520     methyl     methoxy    13-desmethoxy-13-methyl-FK-520
FK-520     methyl     methyl     13,15-didesmethoxy-13,15-dimethyl-FK-520
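
The derivative names in the table follow a regular pattern: each position that loses its methoxy group contributes to a "desmethoxy" prefix, and each position that gains a methyl group contributes to a "methyl" infix. The following Python sketch is an illustration of that pattern, with the hypothetical function name `derivative_name`. It regenerates the eighteen rows from the three possible groups at C-13 and C-15.

```python
from itertools import product

SUBSTITUENTS = ("hydrogen", "methoxy", "methyl")


def derivative_name(parent, c13, c15):
    """Name the FK-506/FK-520 derivative for the given C-13 and C-15 groups."""
    groups = {"13": c13, "15": c15}
    # Positions where the native methoxy group has been replaced.
    lost = [pos for pos, group in groups.items() if group != "methoxy"]
    # Positions that now carry a methyl group.
    methylated = [pos for pos, group in groups.items() if group == "methyl"]
    if not lost:
        return f"Original Compound - {parent}"
    name = ",".join(lost) + ("-didesmethoxy" if len(lost) == 2 else "-desmethoxy")
    if methylated:
        suffix = "-dimethyl" if len(methylated) == 2 else "-methyl"
        name += "-" + ",".join(methylated) + suffix
    return f"{name}-{parent}"


for parent in ("FK-506", "FK-520"):
    for c13, c15 in product(SUBSTITUENTS, repeat=2):
        print(f"{parent:10} {c13:10} {c15:10} {derivative_name(parent, c13, c15)}")
```

Iterating over the groups in alphabetical order reproduces the row order of the table.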



Example 5 



Replacement of Methoxyl with Ethyl at C-13 and/or C-15 of FK-506 and FK-520 
The present invention also provides novel FK-506 and FK-520 derivative 
compounds in which the methoxy groups at either or both the C-13 and C-15 positions 
are instead ethyl groups. These compounds are produced by novel PKS enzymes of the 
invention in which the AT domains of modules 8 and/or 7 are converted to ethylmalonyl 
specific AT domains by modification of the PKS gene that encodes the module. 
Ethylmalonyl specific AT domain coding sequences can be obtained from, for example, 
the FK-520 PKS genes, the niddamycin PKS genes, and the tylosin PKS genes. The 
novel PKS genes of the invention include not only those in which either or both of the 
AT domains of modules 7 and 8 have been converted to ethylmalonyl specific AT 
domains but also those in which one of the modules is converted to an ethylmalonyl 
specific AT domain and the other is converted to a malonyl specific or a methylmalonyl 
specific AT domain. 
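
The combinations covered by this Example can be enumerated directly. The short Python sketch below is illustrative only. It assumes the correspondence used throughout these Examples: the module 8 AT domain determines the C-13 group and the module 7 AT domain determines the C-15 group, with malonyl, methylmalonyl, and ethylmalonyl specificity giving hydrogen, methyl, and ethyl, respectively. The label "native" is a hypothetical name for an unmodified AT domain.

```python
from itertools import product

# Side chain expected at the ring position for each AT substrate specificity.
# "native" denotes the unmodified AT domain, which leaves the methoxy group.
SIDE_CHAIN = {
    "native": "methoxy",
    "malonyl": "hydrogen",
    "methylmalonyl": "methyl",
    "ethylmalonyl": "ethyl",
}


def ethyl_variants():
    """List (module 8 AT, module 7 AT, C-13 group, C-15 group) with an ethyl."""
    rows = []
    for at8, at7 in product(SIDE_CHAIN, repeat=2):
        if "ethylmalonyl" in (at8, at7):
            rows.append((at8, at7, SIDE_CHAIN[at8], SIDE_CHAIN[at7]))
    return rows


for at8, at7, c13, c15 in ethyl_variants():
    print(f"module 8: {at8:14} module 7: {at7:14} C-13: {c13:9} C-15: {c15}")
```

Seven of the sixteen possible pairings contain at least one ethylmalonyl specific AT domain, matching the cases listed in the text: ethyl at one position with the other unchanged, ethyl at both positions, and ethyl at one position with hydrogen or methyl at the other.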



Example 6 

Neurotrophic Compounds 
The compounds described in Examples 1 - 4, inclusive, have immunosuppressant 
activity and can be employed as immunosuppressants in a manner and in formulations 
similar to those employed for FK-506. The compounds of the invention are generally 
effective for the prevention of organ rejection in patients receiving organ transplants and 
in particular can be used for immunosuppression following orthotopic liver 
transplantation. These compounds also have pharmacokinetic properties and metabolism 
that are more advantageous for certain applications relative to those of FK-506 or 
FK-520. These compounds are also neurotrophic; however, for use as neurotrophins, it is 
desirable to modify the compounds to diminish or abolish their immunosuppressant 
activity. This can be readily accomplished by hydroxylating the compounds at the C-18 
position using established chemical methodology or novel FK-520 PKS genes provided 
by the present invention. 

Thus, in one aspect, the present invention provides a method for stimulating nerve 
growth that comprises administering a therapeutically effective dose of 
18-hydroxy-FK-520. In another embodiment, the compound administered is a 
C-18,20-dihydroxy-FK-520 derivative. In another embodiment, the compound administered 
is a C-13-desmethoxy and/or C-15-desmethoxy 18-hydroxy-FK-520 derivative. In another 
embodiment, the compound administered is a C-13-desmethoxy and/or C-15-desmethoxy 
18,20-dihydroxy-FK-520 derivative. In other embodiments, the compounds are the 
corresponding analogs of FK-506. The 18-hydroxy compounds of the invention can be 
prepared chemically, as described in U.S. Patent No. 5,189,042, incorporated herein by 
reference, or by fermentation of a recombinant host cell provided by the present invention 
that expresses a recombinant PKS in which the module 5 DH domain has been deleted or 
rendered non-functional. 

The chemical methodology is as follows. A compound of the invention (~200 mg) 
is dissolved in 3 mL of dry methylene chloride and added to 45 µL of 2,6-lutidine, and 
the mixture stirred at room temperature. After 10 minutes, tert-butyldimethylsilyl 
trifluoromethanesulfonate (64 µL) is added by syringe. After 15 minutes, the reaction 
mixture is diluted with ethyl acetate, washed with saturated bicarbonate, washed with 
brine, and the organic phase dried over magnesium sulfate. Removal of solvent in vacuo 
and flash chromatography on silica gel (ethyl acetate:hexane (1:2) plus 1% methanol) 
gives the protected compound, which is dissolved in 95% ethanol (2.2 mL) and to which 
is added 53 µL of pyridine, followed by selenium dioxide (58 mg). The flask is fitted 
with a water condenser and heated to 70°C on a mantle. After 20 hours, the mixture is 
cooled to room temperature, filtered through diatomaceous earth, and the filtrate poured 
into a saturated sodium bicarbonate solution. This is extracted with ethyl acetate, and the 
organic phase is washed with brine and dried over magnesium sulfate. The solution is 
concentrated and purified by flash chromatography on silica gel (ethyl acetate:hexane 
(1:2) plus 1% methanol) to give the protected 18-hydroxy compound. This compound is 
dissolved in acetonitrile and treated with aqueous HF to remove the protecting groups. 
After dilution with ethyl acetate, the mixture is washed with saturated bicarbonate and 
brine, dried over magnesium sulfate, filtered, and evaporated to yield the 18-hydroxy 
compound. Thus, the present invention provides the C-18-hydroxyl derivatives of the 
compounds described in Examples 1 - 4. 
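
The reagent quantities in this procedure can be converted to approximate molar equivalents. The Python sketch below is a rough worked calculation under stated assumptions, none of which come from this Example: the molar masses and densities are approximate literature values, and the substrate molar mass of 792 g/mol corresponds roughly to unmodified FK-520, whereas the actual derivatives differ somewhat. All names in the code are illustrative.

```python
# Approximate literature values (assumptions, not taken from this Example).
# name: (molar mass in g/mol, density in g/mL, or None for a weighed solid)
REAGENTS = {
    "2,6-lutidine": (107.16, 0.92),
    "TBS triflate": (264.34, 1.151),
    "pyridine": (79.10, 0.978),
    "selenium dioxide": (110.97, None),
}
# Amounts from the procedure: microliters for liquids, milligrams for solids.
AMOUNTS = {
    "2,6-lutidine": 45,
    "TBS triflate": 64,
    "pyridine": 53,
    "selenium dioxide": 58,
}

SUBSTRATE_MG = 200.0
SUBSTRATE_MW = 792.0  # assumed: roughly unmodified FK-520; derivatives differ


def mmol(name):
    """Millimoles of a reagent from its volume (uL) or mass (mg)."""
    molar_mass, density = REAGENTS[name]
    # 1 uL of a liquid with density d g/mL weighs d mg.
    milligrams = AMOUNTS[name] * density if density else AMOUNTS[name]
    return milligrams / molar_mass


substrate_mmol = SUBSTRATE_MG / SUBSTRATE_MW
print(f"substrate          {substrate_mmol:6.3f} mmol")
for name in REAGENTS:
    amount = mmol(name)
    print(f"{name:18} {amount:6.3f} mmol  {amount / substrate_mmol:5.2f} equiv")
```

The equivalents for pyridine and selenium dioxide are nominal, because those reagents are added to the purified protected intermediate, whose mass and molar mass are not given in the procedure.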
Those of skill in the art will recognize that other suitable chemical procedures can 
be used to prepare the novel 18-hydroxy compounds of the invention. See, e.g., Kawai et 
al., Jan. 1993, Structure-activity profiles of macrolactam immunosuppressant FK-506 
analogues, FEBS Letters 316(2): 107-113, incorporated herein by reference. These 
methods can be used to prepare both the C18-[S]-OH and C18-[R]-OH enantiomers, with 
the R enantiomer showing a somewhat lower IC50, which may be preferred in some 
applications. See Kawai et al., supra. Another preferred protocol is described in Umbreit 
and Sharpless, 1977, JACS 99(16): 5526-28, although it may be preferable to use 30 
equivalents each of SeO2 and t-BuOOH rather than the 0.02 and 3-4 equivalents, 
respectively, described in that reference. 

All scientific and patent publications referenced herein are hereby incorporated by 
reference. The invention having now been described by way of written description and 
example, those of skill in the art will recognize that the invention can be practiced in a 
variety of embodiments and that the foregoing description and examples are for purposes 
of illustration and not limitation of the following claims. 